Saturday, November 21, 2020

Setting up Azure DevOps Easy or Right

I'm still somewhat new to the Azure world, and one thing you'll only learn from failing is that there are two ways to set up Azure DevOps: easy or right.

Typically, like so many Microsoft products, it likes to provide an easy solution: you click, something searches, you select an option, and like magic it connects. This is how a service principal (automatic) connection works.

Don't ever fucking do it.

This is guaranteed to cause you more tears and angst with ARM (and possibly other deployment pipelines) than any single shortcut on the planet. The problem is it doesn't alert you to various missing pieces. For starters, you'll have Contributor-level access to a subscription, but for some infrastructure-building tasks you'll need Owner privileges. It won't grant you directory read access to Azure AD. You'll have no idea why some things fail; they just do, and that's all the knowledge you'll have.

I'm sure there are a few extra things, and no doubt you'll blindly stagger around like a 21-year-old on the nightclub dance floor at 2 am wondering why it's not working; the 21-year-old is thinking the same thing while he's trying to meet that girl. You'll feel about as awesome as he does while you wonder why stuff doesn't work.

Enjoy your days building Azure infrastructure with ARM. Infrastructure as code is awesome, but don't let this bite you on the arse.

Friday, July 26, 2019

Connection Strings for VBA and MSOLEDB

I recently picked up a gig which, amongst setting up some databases, included fixing up some spreadsheets to talk to the new database. That meant fixing connection strings in VBA.

Background to the story: the client wanted to move some systems from using Oracle DB as the back end to SQL Server as the data store.

Now, two years ago I had never written a VBA script for Office automation from scratch. It has since become a bit of a skill set I have developed. I am not a fan of some of what I see. Not the code, it's good. More that people have built complete applications using VBA and Excel. This is bad because it doesn't deal well with governance, and it potentially has security issues in that a username and password are available in the sheet. Connection details, including a host name, are all potentially at risk.

On to the bit you came to get. As many of you are aware, Microsoft now endorses OLE DB as the way forward. I am not going into the why, but it means you now have a new driver. You can find more info and the driver file from this page. Grab the driver and get it installed. The old SQLOLEDB provider used in connection strings, along with the SQLNCLI driver, are now deprecated. The new driver uses the MSOLEDBSQL provider.

I needed to update strings, and I had to form the string based on limited information; it didn't help that the new provider isn't listed there as yet. No one else was showing details of a connection string to an Azure DB, which was what I wanted to do.

I have access to the new VBA book “Microsoft Excel 2019 VBA and Macros, First Edition”, and it disappointed me that the book didn't show usage of the new driver; it has a copyright date of 2019 and the driver was released in 2018.

I will save you the hassle of finding this out for yourself. Here is the big reveal:

Dim Hostname As String
Dim InitCatalog As String
Dim UserName As String
Dim Passwd As String
Dim Trusted As String
Dim PType As String
Dim SqlString As String
Dim ReturnArray

Dim sConnectionString As String
Dim con As ADODB.Connection
Dim rst As ADODB.Recordset
Set con = New ADODB.Connection
Set rst = New ADODB.Recordset

PType = "MSOLEDBSQL"          ' the new OLE DB provider
Hostname = ""                 ' your server here, e.g. an Azure SQL host name
InitCatalog = "Adventureworks"
UserName = "dbtest@myytestdb"
Passwd = "somecrazychars"
Trusted = "no"                ' SQL authentication with the user/password above

sConnectionString = "Provider=" & PType & ";" _
                  & "Data Source=" & Hostname & ";" _
                  & "Initial Catalog=" & InitCatalog & ";" _
                  & "User ID=" & UserName & ";" _
                  & "Password=" & Passwd & ";" _
                  & "Trusted_Connection=" & Trusted

SqlString = "SELECT TOP (5) * FROM [SalesLT].[Customer]"
con.ConnectionString = sConnectionString
con.Open                      ' the connection must be open before the recordset uses it
rst.Open SqlString, con
ReturnArray = rst.GetRows
rst.Close
con.Close

Also, don't forget about the issues with non-alphanumeric characters in passwords. Check out the Strong Passwords guidance.
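For anyone outside VBA who wants to see the moving parts, here is a minimal Python sketch of assembling the same style of MSOLEDBSQL connection string. The host and credentials are placeholders, and the quoting rule (wrapping a value that contains a semicolon in double quotes, doubling any embedded quotes) follows the general OLE DB connection string convention rather than anything specific to this driver.

```python
# Build an MSOLEDBSQL-style connection string from its parts.
# Values containing special characters are wrapped in double quotes,
# with embedded double quotes doubled, per the usual OLE DB convention.

def quote_value(value: str) -> str:
    """Quote a connection string value if it contains special characters."""
    if any(ch in value for ch in ';=\'"'):
        return '"' + value.replace('"', '""') + '"'
    return value

def build_connection_string(host, catalog, user, passwd, trusted="no"):
    parts = {
        "Provider": "MSOLEDBSQL",
        "Data Source": host,
        "Initial Catalog": catalog,
        "User ID": user,
        "Password": passwd,
        "Trusted_Connection": trusted,
    }
    return ";".join(f"{k}={quote_value(v)}" for k, v in parts.items())

conn = build_connection_string(
    "tcp:yourserver.database.windows.net",  # placeholder host
    "Adventureworks",
    "dbtest@myytestdb",
    "some;crazy;chars",                     # note the semicolons
)
print(conn)
```

The same quoting trick is what saves you when a generated password happens to contain a `;` or a quote.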

Ok, Enjoy.

See ya round


Saturday, September 08, 2018

A Backdoor for one is a Backdoor for all

The Australian Government wants to put a backdoor into your apps. They are trying to put all sorts of spin on the idea to make you feel like they will have some sort of control and that it's only for them. This is to stop the bad guys: terrorists, someone they don't like, you know, those people.

Let me get this straight: there are only a few ways to enable this.
Put a flaw in the encryption.
Add something like a keylogger to the device.
Allow access by putting a bug in the app.

There are more, but you will get the point soon enough: this is not going to create good outcomes for the consumer.

Let's try to make some sense of each of these approaches for you as a consumer. Hopefully, you will come away with a better understanding of the problem, and then understand that these governments' actions have an impact on the broader market. Their proposals will kill the internet in their overzealous approach to "catching the bad guys".

Put a flaw in the encryption

Many apps like Telegram have made encryption a strong point in selling why users would want to use their app. Encryption secures your messages. Encryption happens between you and the person you are communicating with, so you can safely send information.

Most internet websites use SSL or TLS. These are standards for encryption, so that when you type your banking password into your browser it stays between you and the bank. If encryption is broken for anything, it is broken for everything, and that includes access to your bank. Researchers would publish the flaw in the encryption algorithm, which would force the app's removal.

Apps will progressively dry up until we are no longer able to securely send anything. Oh, hang on, but I can connect to a website and HTTPS will mean my communication is protected. Yes, well, maybe. You see, it is now a slippery slope. When it finally means that I cannot log on to my bank or a government site to conduct business, the internet as we know it has ground to a halt. No Facebook, no Instagram, no Gmail; sorry, it's all gone.

Installing a Keylogger

Wow, where do you start! This is what the bad guys try to do every day. Why? Because it captures all your keystrokes. Literally everything you type: everything I typed into this blog, my username and password for my banking, logging into Amazon to buy a book or other things. Every keystroke is sent off to the endpoint where the keylogger is sending the data. Two problems: can someone intercept it? And can someone else access the place it is going? There will be more issues, but let's start with these.

Hackers will do their darnedest to implement their own tools on your computer to capture the output of the keylogger. I guarantee that they will succeed.

The endpoint. That is some great repository of all the data from everyone's devices, yes, everyone. That is a copy of every keystroke.
First is storage: we create terabytes of data each day globally, so it will require some significant tech to store all that data, at significant cost to taxpayers. Second, the endpoint will need better-than-world-class developers and security teams to secure this data. Look above at what is being stored: everything you need to cause identity crises for much of that country.
To a hacker, these would be golden treasure troves. The value of the repository to the bad guys would be immense. The ability to lure people into doing any number of things would have zero monetary bounds. You want $10 million to do that? Sure, hell, I would have paid $50 million.

A bug in the app 

Those who have reason to will almost immediately find the bug. These might be enemies of your state (your country), and just about anyone who might want to commit an offence against you personally. It can't be there just for the government spying agency; that isn't how these things work. All sorts of people will either be screaming for a fix for the bug or quietly extracting your data. How will you know if the app has a bug? Publishing the details is the only way you will ever know.

The Australian government has placed penalties of 10 years' custody in their proposed legislation. This is to stop good people telling you that your app's developer has introduced a bug to allow the government to spy. Even reporting that you found all sorts of data or other relevant things on the dark web would leave you open to prosecution.

See ya round


Thursday, July 26, 2018

Anaconda, Jupyter labs, Java and Broken Stuff

The premise

I decided to have a look over Jupyter Labs as part of some professional development. This was to start learning and getting some skills in Jupyter.

I decided to use the Safari education provided through ACM. A seemingly good starting point was Jupyter: Interactive Computing with Jupyter Notebook. This turns out to be a fast-paced set of videos using Jupyter Notebook, which is slightly out of date relative to Jupyter Labs. It's a broad brush and introduces a lot of concepts in quick succession.

What did I find

Jupyter Labs is exciting and looks like a great way forward in the Jupyter world. Following along with the course, I went through a few examples and installed the extra components up to Scala (which is failing on Ubuntu 18). The author mentions a few problems with the maturity of installing on Windows, and I'd say there's work to do in the Linux world too. I will follow up with a demo of installing all this.

Getting to complete a variety of visualisations in various notebooks, including Python, R and Julia, is a good place to get started. Due to its breadth, the course is light on specific content in any one area. It helps get across a broad understanding of the Jupyter ecosystem.

There's also a brief introduction to Jupyter Hubs and the use of Docker with Jupyter.

The problems I ran into

As I mentioned above, Scala won't currently install on Ubuntu 18 without a few tweaks. This is due to a bug with cacerts; the fix is well underway and should be out soon (24/07/2018).

JavaScript in Labs is broken due to an issue with extensions. The same issue is causing other problems with the use of widgets.
There is also a lot of hit and miss in getting all the relevant packages installed to support all Jupyter Labs functionality.

On this Learning Path on Safari

This is a fast-paced introduction, skimming just above the surface of Jupyter. You will learn a broad sweep of Jupyter. Labs is different, so you will have to work a bit out yourself. There's very little to teach you about how to formulate a notebook (rules, guidance, etc.). While interesting, the course does provide an overview. There is some opportunity to execute code, but the fact you need to own the book to access any files makes it less than fantastic. That is not so much a hit at the content as at how Safari must be licensing this Packt content.

Thursday, July 12, 2018

Yum failing error 14 no more mirrors

If you encounter [Errno 14] when trying to execute yum commands, then you have an out-of-date cache.

I just booted up a CentOS VM that hadn't been used for around 3 months and was trying to install Docker. The VM would throw this error and not install anything.

[root@dataserver1 ~]# yum install docker
Loaded plugins: fastestmirror, langpacks
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
[Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below knowledge base article
If above article doesn't help to resolve this issue please create a bug on
[Errno 14] HTTP Error 404 - Not Found
Trying other mirror.

It's a simple fix; simply execute:

yum clean all
yum makecache fast

This will rebuild your repo cache with proper details for the latest repo mirrors.

See ya round


Saturday, July 07, 2018

SQL Server Buffer Cache Hit Ratio is dead

I was recently working with a client on a problem they had with an application. The client requested someone have a look over the database to see if anything had been overlooked by the internal staff.

I started browsing around in the monitoring and noticed an anomaly. The database was showing an increase in IO Wait and nothing else. The monitoring tool hadn't alerted, though an advisor suggested reviewing IO Wait. Unfortunately, overlooking these measures meant missing the problem. The general health of the database instance (CPU, memory, the number of connections, etc.) was all rather benign. This, along with a high Buffer Cache Hit Ratio (BCHR), led people to conclude that the database server wasn't struggling and hence was OK.

I've seen quite a few articles in the SQL Server world dissing wait states and holding up BCHR as the measure of DB health, and yet here was exactly the thing Method R points at flagging an issue.

There's a great example over at SimpleTalk showing why BCHR isn't the greatest of measures. A database might be performing great work with a BCHR of only 65%; only the individual circumstances will tell you whether that's right for a system. As the SimpleTalk article discussed, a system could be exhibiting memory pressure and yet still show a high BCHR.

When you search for Buffer Cache Hit Ratio tuning you'll find an article by Joe Chang which, unfortunately, muddies the water on what reducing logical IO was about. The thing is that Method R is about tuning out wait time, and it's highly applicable to modern SQL Server.
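For reference, BCHR itself is just page requests served from cache as a share of total page requests, which is why it says so little on its own; a trivial sketch:

```python
def bchr(cache_hits: int, total_page_lookups: int) -> float:
    """Buffer Cache Hit Ratio: page lookups served from memory, as a percentage."""
    return 100.0 * cache_hits / total_page_lookups

# A workload rereading the same hot pages shows a high ratio even while
# individual queries sit waiting on disk for the pages they still need.
print(bchr(990, 1000))  # a "healthy looking" 99%
print(bchr(65, 100))    # the SimpleTalk-style case: fine work at 65%
```

Neither number tells you anything about how long anyone waited, which is exactly what Method R measures.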

When you look at some of the graphs from the system, you see something like this from when the problem was present.

What you can see here is a graph of various waits, with the underlying area of IO wait in blue.
This graph may not look that spectacular; nothing stands out, and that's why it might be easy to overlook. The catch was that IO Wait was 30% and higher as a percentage of wait time, and other data was showing wait times of seconds. Looking at the other measures: CPU at 20% to 40% with random spikes, nothing there; BCHR always above 90% and mostly above 99%. I then thought I'd overlooked something.

We investigated several potentially slow queries identified from the logs of software using this database. That work found the need to fix one index. A few others went on the backlog for future work, but still no performance gain. I had to come back to the question: why was it getting such high IO? We'd optimised queries, we didn't have other performance issues, but we had higher IO Wait time than seemed reasonable.

The next step was to get diskspd and run it on the drive where the database's data file lived. The results are below. This server had dual 8Gb HBA channels to the SAN. If you look down in the second section, the total throughput is around 48MB/s, well below the capacity of even a single 8Gb channel. Also look at the 99th percentile. For those not quite up on their statistics: 1 out of 100 reads diskspd made was taking 32ms, and 1 out of 100 writes 16ms. Unlucky you if you got the 100th of both at the same time.

Command Line: diskspd -b8K -d30 -o8 -t8 -h -r -w35 -L -Z1G -c50G G:\IOTest\iotestR1T1.dat

Input parameters:

timespan:   1
duration: 30s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'G:\IOTest\iotestR1T1.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
write buffer size: 1073741824
performing mix test (read/write ratio: 65/35)
block size: 8192
using random I/O (alignment: 8192)
number of outstanding I/O operations: 8
thread stride size: 0
threads per file: 8
using I/O Completion Ports
IO priority: normal

Results for timespan 1:

actual test time: 30.00s
thread count: 8
proc count: 8

CPU |  Usage |  User  |  Kernel |  Idle
   0|  19.66%|  13.99%|    5.67%|  80.34%
   1|  40.67%|  11.08%|   29.59%|  59.33%
   2|  19.71%|  11.86%|    7.85%|  80.29%
   3|  21.16%|  15.91%|    5.25%|  78.84%
   4|  25.33%|  17.42%|    7.90%|  74.67%
   5|  20.59%|  13.36%|    7.23%|  79.41%
   6|  22.36%|  14.04%|    8.32%|  77.64%
   7|  25.79%|  13.47%|   12.32%|  74.21%
avg.|  24.41%|  13.89%|   10.52%|  75.59%

Total IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
     0 |       191004672 |        23316 |       6.07 |     777.18 |   10.291 |     6.838 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |       187531264 |        22892 |       5.96 |     763.05 |   10.480 |     8.466 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |       190963712 |        23311 |       6.07 |     777.01 |   10.293 |     6.858 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |       188301312 |        22986 |       5.99 |     766.18 |   10.438 |     8.153 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |       188792832 |        23046 |       6.00 |     768.18 |   10.410 |     7.029 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |       192118784 |        23452 |       6.11 |     781.71 |   10.232 |     6.327 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |       191086592 |        23326 |       6.07 |     777.51 |   10.284 |     6.488 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |       191832064 |        23417 |       6.10 |     780.55 |   10.246 |     6.587 | G:\IOTest\iotestR1T1.dat (50GiB)
total:        1521631232 |       185746 |      48.37 |    6191.39 |   10.333 |     7.126

Read IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
     0 |       124010496 |        15138 |       3.94 |     504.59 |   11.605 |     7.852 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |       121675776 |        14853 |       3.87 |     495.09 |   11.788 |     9.912 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |       123576320 |        15085 |       3.93 |     502.82 |   11.627 |     7.880 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |       122241024 |        14922 |       3.89 |     497.39 |   11.773 |     9.524 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |       122519552 |        14956 |       3.89 |     498.52 |   11.778 |     8.101 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |       124936192 |        15251 |       3.97 |     508.35 |   11.538 |     7.227 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |       123412480 |        15065 |       3.92 |     502.15 |   11.619 |     7.431 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |       123387904 |        15062 |       3.92 |     502.05 |   11.590 |     7.604 | G:\IOTest\iotestR1T1.dat (50GiB)
total:         985759744 |       120332 |      31.34 |    4010.97 |   11.664 |     8.237

Write IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
     0 |        66994176 |         8178 |       2.13 |     272.59 |    7.860 |     3.170 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |        65855488 |         8039 |       2.09 |     267.96 |    8.062 |     3.687 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |        67387392 |         8226 |       2.14 |     274.19 |    7.845 |     3.182 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |        66060288 |         8064 |       2.10 |     268.79 |    7.968 |     3.497 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |        66273280 |         8090 |       2.11 |     269.66 |    7.882 |     3.092 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |        67182592 |         8201 |       2.14 |     273.36 |    7.802 |     2.876 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |        67674112 |         8261 |       2.15 |     275.36 |    7.850 |     2.998 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |        68444160 |         8355 |       2.18 |     278.49 |    7.823 |     2.872 | G:\IOTest\iotestR1T1.dat (50GiB)
total:         535871488 |        65414 |      17.03 |    2180.42 |    7.886 |     3.182

  %-ile |  Read (ms) | Write (ms) | Total (ms)
    min |      0.220 |      1.210 |      0.220
   25th |      5.568 |      5.910 |      5.742
   50th |      9.924 |      7.630 |      8.435
   75th |     16.694 |      9.396 |     13.386
   90th |     21.264 |     10.991 |     19.469
   95th |     24.205 |     12.038 |     22.507
   99th |     32.656 |     16.721 |     30.466
3-nines |     53.280 |     34.684 |     49.527
4-nines |    151.592 |     70.318 |    113.512
5-nines |    530.571 |    110.152 |    530.571
6-nines |    596.914 |    110.152 |    596.914
7-nines |    596.914 |    110.152 |    596.914
8-nines |    596.914 |    110.152 |    596.914
9-nines |    596.914 |    110.152 |    596.914
    max |    596.914 |    110.152 |    596.914
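As an aside on reading the percentile table, here is a small Python illustration on synthetic latency numbers (not the diskspd data above): the 99th percentile is the value that 99 in 100 samples sit at or below, and a single extreme outlier can live entirely in that tail.

```python
import statistics

# Synthetic read latencies in ms, illustration only: mostly fast I/Os,
# a small slow tail, and one extreme outlier.
latencies = [5.0] * 80 + [10.0] * 15 + [20.0] * 4 + [600.0]

# statistics.quantiles with n=100 returns the 1st..99th percentiles.
pct = statistics.quantiles(latencies, n=100)
p50, p99 = pct[49], pct[98]

print(f"median latency:  {p50:.1f} ms")  # what a 'typical' I/O sees
print(f"99th percentile: {p99:.1f} ms")  # 1 in 100 I/Os is at least this slow
print(f"mean latency:    {statistics.mean(latencies):.1f} ms")  # averages hide the tail's shape
```

This is why the 3-nines rows and above in the diskspd output matter: the worst handful of I/Os can be an order of magnitude slower than the average suggests.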

This led to further investigation by other people, which found another database that had recently had a change made and was running at up to 798Mb/s over its port. These two databases shared a common port on some storage middleware above the disk servers for the SAN.

A few takeaways
  1. Assume nothing
  2. Test your theories
  3. Gather evidence
  4. Ask questions
  5. Challenge perceptions
The end result: the troublesome database had its IO access moved away from any further interaction with the database whose slowness I'd been asked to investigate.
Another thing: if you have slowness and you're trying to isolate the source, it might take differently worded questions. "How is the database?" might not yield the answer you want. Perhaps try "What is happening in the database when an incident is happening, and is that good or bad?"

Good luck with tuning

See ya round


Sunday, April 08, 2018

What happened at Facebook with Cambridge Analytica

I see a few people asking what happened with Cambridge Analytica who don't really understand the technology of data and marketing.

Firstly there is the right or legit way Facebook uses and sells data for marketing purposes.

As a general rule, Facebook provides aggregated data to advertisers. So if you lived in Oxfordshire in the UK, Facebook aggregates the data of everyone it identifies as living in Oxfordshire. Then, in accordance with general marketing data rules (this is a simplified description, not the legal rule), it would remove data that easily identifies a person, which risks further identification of actual people, and bundle up some insights either as a data set or through its own software for marketers. Cambridge Analytica is different in two ways.
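The legitimate aggregate-and-suppress approach just described can be sketched in a few lines of Python. The grouping fields and the threshold of 5 are illustrative assumptions on my part, not Facebook's actual rules:

```python
from collections import Counter

# Toy user records: (region, interest). Real ad platforms hold far richer
# records; this only illustrates the aggregate-then-suppress idea.
users = [
    ("Oxfordshire", "gardening"), ("Oxfordshire", "gardening"),
    ("Oxfordshire", "gardening"), ("Oxfordshire", "gardening"),
    ("Oxfordshire", "gardening"), ("Oxfordshire", "football"),
    ("Oxfordshire", "falconry"),  # a one-person group
]

K_THRESHOLD = 5  # illustrative: drop groups small enough to identify someone

def aggregate(records, k=K_THRESHOLD):
    """Count users per (region, interest), suppressing small groups."""
    counts = Counter(records)
    return {group: n for group, n in counts.items() if n >= k}

report = aggregate(users)
print(report)  # the lone falconer never reaches the advertiser
```

Groups too small to hide an individual are dropped before anything is shared; that suppression step is what the app-based harvesting route sidestepped.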

First, they directly engaged with US people's accounts on Facebook using an app. I won't call out any, however you can be reasonably assured any app on Facebook isn't there for your enjoyment; it is there to get at your data. Cambridge Analytica simply created an app and had people engage with it. It might have been guessing your age or funny captions for a photo, but it reeled in a significant number of people. When you connected to that app it would have asked for certain permissions, and it then officially had access to whatever data you agreed to share. Here is where the game changed. Cambridge Analytica had realised that, due to a bug, they could access not only the data you had agreed to provide but additional data, and just about as much from your friends. Many of us who don't live in the US have friends in the US. Cambridge Analytica did not discriminate; they simply sucked down everything within their ability to access and took all that data into a pond of their own making.

A couple of things: unless you sent credit card details via a messenger post to someone, there is little risk of credit card data being there. If your full home address is there, then you are at risk of having it in the pond of data Cambridge Analytica gathered.

Why did they go to that trouble, and what was the outcome?

Without providing a lesson in statistics (there are plenty freely available on the web), the whole thing was to identify what your associations are and therefore profile you, particularly if you reside in the US. I am betting they did it for everyone whose data they had, with varying degrees of success. Why?

If you liked a lot of posts about environmental issues, you probably won't like people taking down environmental protection rules. How do I convince you that the person planning to take down the environmental laws is the good guy? I make sure I provide you with information that shows why the current rules favour someone who is not you (you are missing out). This is the sneaky way in which they targeted people with ads or fake news items intended to alter their perceptions. If you can create cognitive dissonance in someone, you have a reasonable chance of changing their mind.

This has been an ongoing trend in business: a customer's data is treated as more important than the customer, because once they have it, they have it potentially for life.

GDPR will put pressure on operators in Europe, and the rest of the world needs to follow the European standard. Countries which don't have data provisions similar to GDPR need to start moving, or their governments should expect to be annihilated at the next elections for failing to protect their citizens from such unruly behaviour and such disregard for users' data.

To recap: there was a breach in the way Facebook was being accessed through the provided interface. This allowed Cambridge Analytica to access far more data than they ever should have been allowed to.
This was a major failure in Facebook's engineering and in their privacy and security practices.
This data then appears to have been misused to target ads at people. It is quite possible someone you sit next to at work was getting a very different political message from you and your friends, with a similar split across other cohorts from your workplace or a user group you share membership of.

Facebook is in hot water in a variety of countries right now, including the hearing in Congress next week.

See ya round