Saturday, September 08, 2018

A Backdoor for one is a Backdoor for all

The Australian Government wants to put a backdoor into your apps. They are trying to put all sorts of spin on the idea to make you feel like they will have some sort of control and it's only for them. This is to stop the bad guys: terrorists, someone they don't like, you know, those people.

Let me get this straight, there are only a few ways to enable this:
Put a flaw in the encryption.
Add something like a keylogger to the device.
Allow access by putting a bug in the app.


There are more, but you will get the point soon enough: none of these is going to create good outcomes for the consumer.

Let's try to make sense of each of these approaches for you as a consumer. Hopefully, you will have a better understanding of the problem, and then understand that these government actions have an impact on the broader market. Their proposals will kill the internet in their overzealous approach to "catching the bad guys".

Put a flaw in the encryption

Many apps like Telegram have made a strong point of this in selling why users would want to use their app. Encryption secures your messages. Encryption happens between you and the other person you are communicating with, so you can safely send information.

Most websites on the internet use SSL or TLS. These are standards for encryption, so that when you type your banking password into your browser it stays between you and the bank. If encryption is broken for anything, it is broken for everything, and that includes access to your bank. Researchers would publish the flaw in the encryption algorithm, which would force the app's removal.
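
To make that concrete, here is a minimal sketch in Python of the kind of encrypted connection your browser sets up, using the standard library's ssl module; example.com simply stands in for your bank's site. If the algorithm underneath this handshake is deliberately weakened, every connection made this way is weakened with it.

# Minimal sketch: the TLS handshake a browser performs, in Python.
# example.com is a placeholder for any HTTPS site, e.g. your bank.
import socket
import ssl

context = ssl.create_default_context()           # trusted CAs, modern TLS versions
with socket.create_connection(("example.com", 443)) as sock:
    # wrap_socket performs the TLS handshake and verifies the certificate
    with context.wrap_socket(sock, server_hostname="example.com") as tls:
        print(tls.version())                     # e.g. 'TLSv1.3'
        print(tls.getpeercert()["subject"])      # who you are really talking to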

Apps will progressively dry up until we are no longer able to securely send anything. Oh, hang on, but I can connect to a website and HTTPS will mean my communication is protected. Yes, well, maybe. You see, it is now a slippery slope. When it finally means that I cannot log on to my bank or a government site to conduct business, the internet as we know it has ground to a halt. No Facebook, no Instagram, no Gmail; sorry, it's all gone.

Installing a Keylogger

Wow, where do you start! This is what the bad guys try to do every day. Why? Because it captures all your keystrokes. Literally everything you type, including everything I typed into this blog, my username and password for my banking, logging into Amazon to buy a book or other things. Every keystroke is sent off to the endpoint where the keylogger is sending the data. Two problems: can someone intercept it, and who can access the place it is going? There will be more issues, but let's start with these.

Hackers will do their darnedest to install their own tools on your computer to capture the output of the keylogger. I guarantee that they will succeed.

The endpoint. That is some great repository of all the data from everyone's devices, yes, everyone. That is a copy of every keystroke.
First is storage: we create terabytes of data each day globally. Storing all of that will require some significant tech and significant cost to taxpayers. Second, the endpoint will need better than world-class developers and security teams to secure this data. Look above at what is being stored: everything you need to cause an identity crisis for much of that country.
To a hacker, this would be a golden treasure trove. The value of the repository to the bad guys would be immense. The ability to lure people into doing any number of things would have no monetary bounds. You want $10 million to do that? Sure, hell, I would have paid $50 million.

A bug in the app 

Those who have reason to will almost immediately find the bug. These might be enemies of your state (your country) and just about anyone who might want to commit an offence against you personally. It can't be there just for the government spying agency, as that isn't how these things work. All sorts of people will either be screaming for a fix for the bug or quietly extracting your data. How will you know if the app has a bug? Published details are the only way you will ever know.

The Australian government has placed penalties of 10 years' custody in its proposed legislation. This is to stop good people telling you that your app's developer has introduced a bug to allow the government to spy. Even reporting the fact that you found all sorts of data or other relevant things on the dark web would leave you open to prosecution.

See ya round

Peter

Thursday, July 26, 2018

Anaconda, Jupyter labs, Java and Broken Stuff

The premise

I decided to have a look over Jupyter Labs as part of some professional development. This was to start learning and getting some skills in Jupyter.

I decided to use the Safari education provided through ACM. A seemingly good starting point was Jupyter: Interactive Computing with Jupyter Notebook. This turns out to be a fast-paced set of videos using Jupyter Notebook, which is slightly out of date compared with Jupyter Labs. It's a broad brush and introduces a lot of concepts in quick succession.

What did I find

Jupyter Labs is exciting and looks like a great way forward in the Jupyter world. Following along with the course I went through a few examples and installed the extra components up to Scala (which is failing on Ubuntu 18). The author mentions a few problems with the maturity of installing on Windows, and I'd say there's work to do in the Linux world. I will follow up with a demo of installing all this.

Getting to complete a variety of visualisations in various notebooks, including Python, R and Julia, is a good place to get started. The course, due to its breadth, is light on specific content in any one area. It helps get across a broad understanding of Jupyter.
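
To give a feel for what that involves, below is a minimal sketch (not taken from the course) of the kind of Python visualisation cell you end up running in a notebook; the numbers are made up.

# A typical notebook cell: plot a simple curve with matplotlib.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
plt.plot(x, np.sin(x), label="sin(x)")
plt.title("A first JupyterLab plot")
plt.legend()
plt.show()                       # in JupyterLab the figure renders inline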

There's also a brief introduction to Jupyter Hubs and the use of Docker with Jupyter.

The problems I ran into

As I mentioned above, Scala won't currently install on Ubuntu 18 without a few tweaks. This is due to a bug with cacerts; the fix is well underway and should be out soon (24/07/2018).

JavaScript in Labs is broken due to an issue with extensions. This same issue is causing other problems with the use of widgets.
There is also a lot of hit and miss in getting all the relevant packages installed to support all Jupyter Labs functionality.

On this Learning Path on Safari

This is a fast-paced introduction; it's skimming just above the surface of Jupyter. You will learn a broad sweep of Jupyter. Labs is different, so you will have to work a bit out yourself. There's very little to teach you about how to formulate a notebook (rules, guidance, etc.). While it skims, the course does provide an interesting overview. There is some opportunity to execute some code, but the fact that you need to own the book to access any files makes it less than fantastic. This is not so much a hit at the content, but at how Safari must be licensing this Packt content.

Thursday, July 12, 2018

Yum failing error 14 no more mirrors

If you encounter [Errno 14] when trying to execute yum commands, then you have an out-of-date cache.

I had just booted up a CentOS VM that hadn't been used for around 3 months and was trying to install Docker. The VM would throw this error and not install anything.


[root@dataserver1 ~]# yum install docker
Loaded plugins: fastestmirror, langpacks
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
http://mirror.aarnet.edu.au/pub/centos/7.3.1611/os/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.
To address this issue please refer to the below knowledge base article

https://access.redhat.com/articles/1320623

If above article doesn't help to resolve this issue please create a bug on https://bugs.centos.org/

http://mirror.internode.on.net/pub/centos/7.3.1611/os/x86_64/repodata/repomd.xml: [Errno 14] HTTP Error 404 - Not Found
Trying other mirror.


It's a simple fix, simply execute:

yum clean all
yum makecache fast

This will rebuild your repo cache with proper details for the latest repo mirrors.


See ya round

Peter

Saturday, July 07, 2018

SQL Server Buffer Cache Hit Ratio is dead

I was recently working with a client on a problem they had with an application. The client requested someone have a look over the database to see if anything could be found which the internal staff might have overlooked.

I started browsing around in the monitoring and noticed an anomaly. The database was showing an increase in IO Wait and nothing else. The monitoring tool hadn't alerted, though an advisor did suggest reviewing IO Wait. Unfortunately, overlooking these measures was missing the problem. The general health of the database instance, CPU, memory, the number of connections etc. was all rather benign. This, along with a high BCHR, led people to the conclusion that the database server wasn't struggling and hence it was OK.

I've seen quite a few articles in the SQL Server world dissing wait states and claiming BCHR as the measure of database health, and yet here we are: the exact thing Method R says to look at pointed to an issue.

There's a great example over at SimpleTalk showing why BCHR isn't the greatest of measures. A database might be performing great work with only a BCHR of 65%; only the individual circumstances will tell you whether that's right for a system. As the article at SimpleTalk discussed, a system could be exhibiting memory pressure and yet still show a high BCHR.

When you search for Buffer Cache Hit Ratio tuning you'll find an article by Joe Chang which, unfortunately, muddies the water on what reducing LIO was about. The thing is that Method R is about tuning out wait time, and that is highly applicable to modern SQL Server.
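
As an illustration of looking at wait time rather than BCHR, here is a minimal sketch that pulls the top waits from sys.dm_os_wait_stats using pyodbc; the server name and connection details are placeholder assumptions, and this is not the client's monitoring tooling.

# Sketch: top waits by accumulated wait time (placeholder connection string).
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=master;Trusted_Connection=yes;"
)
cursor = conn.cursor()
cursor.execute("""
    SELECT TOP 10 wait_type, waiting_tasks_count,
           wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;
""")
for row in cursor.fetchall():
    print(row.wait_type, row.wait_time_ms)
conn.close()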

When you look at some of the graphs from the system, you see something that looked like this while the problem was occurring.

What you can see here is a graph of various waits, with the underlying area of IO Wait in blue.
This graph may not look that spectacular, nothing stands out, and that's why it might be easy to overlook. The issue was that IO Wait was 30% and higher as a percentage of wait time, and other data was showing wait times of seconds. Looking at the other measures: CPU at 20% to 40% with random spikes in usage, nothing there; BCHR always above 90% and mostly above 99%. I then began to wonder whether I'd overlooked something.

We investigated several potentially slow queries identified from the logs of the software using this database. This work found one index that needed fixing. A few others went on the backlog for future work, but still no performance improvement. I had to come back to why it was seeing such high IO. We'd optimised queries and we didn't have other performance issues, but we had higher IO Wait time than seemed reasonable.

The next step was to get diskspd and run it on the drive holding the database's data file. The results are below. This server had dual 8Gb HBA channels to the SAN. If you look down in the second section, the total throughput is around 48MB/s, well short of what a single 8Gb channel can deliver. Also look at the 99th percentile. For those not quite up on their statistics, here I go: 1 out of every 100 reads diskspd made was taking 32ms or more, and 1 out of every 100 writes 16ms or more. Unlucky you if you got the 100th of both at the same time.



Command Line: diskspd -b8K -d30 -o8 -t8 -h -r -w35 -L -Z1G -c50G G:\IOTest\iotestR1T1.dat

Input parameters:

timespan:   1
-------------
duration: 30s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'G:\IOTest\iotestR1T1.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
write buffer size: 1073741824
performing mix test (read/write ratio: 65/35)
block size: 8192
using random I/O (alignment: 8192)
number of outstanding I/O operations: 8
thread stride size: 0
threads per file: 8
using I/O Completion Ports
IO priority: normal



Results for timespan 1:
*******************************************************************************

actual test time: 30.00s
thread count: 8
proc count: 8

CPU |  Usage |  User  |  Kernel |  Idle
-------------------------------------------
   0|  19.66%|  13.99%|    5.67%|  80.34%
   1|  40.67%|  11.08%|   29.59%|  59.33%
   2|  19.71%|  11.86%|    7.85%|  80.29%
   3|  21.16%|  15.91%|    5.25%|  78.84%
   4|  25.33%|  17.42%|    7.90%|  74.67%
   5|  20.59%|  13.36%|    7.23%|  79.41%
   6|  22.36%|  14.04%|    8.32%|  77.64%
   7|  25.79%|  13.47%|   12.32%|  74.21%
-------------------------------------------
avg.|  24.41%|  13.89%|   10.52%|  75.59%

Total IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
-----------------------------------------------------------------------------------------------------
     0 |       191004672 |        23316 |       6.07 |     777.18 |   10.291 |     6.838 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |       187531264 |        22892 |       5.96 |     763.05 |   10.480 |     8.466 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |       190963712 |        23311 |       6.07 |     777.01 |   10.293 |     6.858 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |       188301312 |        22986 |       5.99 |     766.18 |   10.438 |     8.153 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |       188792832 |        23046 |       6.00 |     768.18 |   10.410 |     7.029 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |       192118784 |        23452 |       6.11 |     781.71 |   10.232 |     6.327 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |       191086592 |        23326 |       6.07 |     777.51 |   10.284 |     6.488 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |       191832064 |        23417 |       6.10 |     780.55 |   10.246 |     6.587 | G:\IOTest\iotestR1T1.dat (50GiB)
-----------------------------------------------------------------------------------------------------
total:        1521631232 |       185746 |      48.37 |    6191.39 |   10.333 |     7.126

Read IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
-----------------------------------------------------------------------------------------------------
     0 |       124010496 |        15138 |       3.94 |     504.59 |   11.605 |     7.852 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |       121675776 |        14853 |       3.87 |     495.09 |   11.788 |     9.912 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |       123576320 |        15085 |       3.93 |     502.82 |   11.627 |     7.880 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |       122241024 |        14922 |       3.89 |     497.39 |   11.773 |     9.524 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |       122519552 |        14956 |       3.89 |     498.52 |   11.778 |     8.101 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |       124936192 |        15251 |       3.97 |     508.35 |   11.538 |     7.227 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |       123412480 |        15065 |       3.92 |     502.15 |   11.619 |     7.431 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |       123387904 |        15062 |       3.92 |     502.05 |   11.590 |     7.604 | G:\IOTest\iotestR1T1.dat (50GiB)
-----------------------------------------------------------------------------------------------------
total:         985759744 |       120332 |      31.34 |    4010.97 |   11.664 |     8.237

Write IO
thread |       bytes     |     I/Os     |    MiB/s   |  I/O per s |  AvgLat  | LatStdDev |  file
-----------------------------------------------------------------------------------------------------
     0 |        66994176 |         8178 |       2.13 |     272.59 |    7.860 |     3.170 | G:\IOTest\iotestR1T1.dat (50GiB)
     1 |        65855488 |         8039 |       2.09 |     267.96 |    8.062 |     3.687 | G:\IOTest\iotestR1T1.dat (50GiB)
     2 |        67387392 |         8226 |       2.14 |     274.19 |    7.845 |     3.182 | G:\IOTest\iotestR1T1.dat (50GiB)
     3 |        66060288 |         8064 |       2.10 |     268.79 |    7.968 |     3.497 | G:\IOTest\iotestR1T1.dat (50GiB)
     4 |        66273280 |         8090 |       2.11 |     269.66 |    7.882 |     3.092 | G:\IOTest\iotestR1T1.dat (50GiB)
     5 |        67182592 |         8201 |       2.14 |     273.36 |    7.802 |     2.876 | G:\IOTest\iotestR1T1.dat (50GiB)
     6 |        67674112 |         8261 |       2.15 |     275.36 |    7.850 |     2.998 | G:\IOTest\iotestR1T1.dat (50GiB)
     7 |        68444160 |         8355 |       2.18 |     278.49 |    7.823 |     2.872 | G:\IOTest\iotestR1T1.dat (50GiB)
-----------------------------------------------------------------------------------------------------
total:         535871488 |        65414 |      17.03 |    2180.42 |    7.886 |     3.182



total:
  %-ile |  Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
    min |      0.220 |      1.210 |      0.220
   25th |      5.568 |      5.910 |      5.742
   50th |      9.924 |      7.630 |      8.435
   75th |     16.694 |      9.396 |     13.386
   90th |     21.264 |     10.991 |     19.469
   95th |     24.205 |     12.038 |     22.507
   99th |     32.656 |     16.721 |     30.466
3-nines |     53.280 |     34.684 |     49.527
4-nines |    151.592 |     70.318 |    113.512
5-nines |    530.571 |    110.152 |    530.571
6-nines |    596.914 |    110.152 |    596.914
7-nines |    596.914 |    110.152 |    596.914
8-nines |    596.914 |    110.152 |    596.914
9-nines |    596.914 |    110.152 |    596.914
    max |    596.914 |    110.152 |    596.914
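
To make the percentile figures above concrete, here is a small Python illustration using made-up latency samples (not the diskspd data); it simply shows what the 99th percentile means.

# Toy example: percentiles of synthetic latency samples, in milliseconds.
import numpy as np

latencies_ms = np.random.lognormal(mean=2.3, sigma=0.5, size=100_000)
for p in (50, 90, 95, 99):
    print(f"{p}th percentile: {np.percentile(latencies_ms, p):.2f} ms")
# The 99th percentile is the latency that 1 in every 100 I/Os is slower than.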

This led to further investigation by other people, who found another database which had recently had a change made and was running at up to 798Mb/s over its port. These two databases shared a common port on some storage middleware sitting above the disk servers for the SAN.

A few takeaways:
  1. Assume nothing
  2. Test your theories
  3. Gather evidence
  4. Ask questions
  5. Challenge perceptions
The end result: the troubling database had its IO access moved away from further interaction with the database whose performance slowness I'd been asked to investigate.
One more thing: if you have slowness and you're trying to isolate the source, it might take differently worded questions. "How is the database?" might not yield the answer you want. Perhaps try "What is happening in the database when an incident is happening, and is that good or bad?"


Good luck with tuning

See ya round

Peter

Sunday, April 08, 2018

What happened at Facebook with Cambridge Analytica

I see a few people asking questions who don't really understand the technology of data and marketing, or what happened with Cambridge Analytica.

Firstly, there is the right or legitimate way Facebook uses and sells data for marketing purposes.

As a general rule Facebook provides aggregated data to advertisers. So if you lived in Oxfordshire in the UK, Facebook aggregates the data of all the people it identifies as living in Oxfordshire. Then, in accordance with general marketing data rules (this is a simplified description, not the legal rule), it would remove data that easily identifies a person, which risks further identification of actual people, and bundle up some insights either as a data set or through its own software for marketers. Cambridge Analytica is different in two ways.
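
As a toy illustration of the aggregation idea, here is a sketch in Python using pandas; the data, the column names and the suppression threshold are all invented, and nothing here reflects Facebook's actual rules.

# Toy sketch: aggregate users into cohorts and suppress tiny cohorts that
# could identify individuals. All values are invented.
import pandas as pd

users = pd.DataFrame({
    "region":   ["Oxfordshire", "Oxfordshire", "Oxfordshire", "Rutland"],
    "interest": ["hiking",      "hiking",      "baking",      "hiking"],
})

cohorts = users.groupby(["region", "interest"]).size().reset_index(name="count")
MIN_COHORT = 2                                   # invented threshold
safe_cohorts = cohorts[cohorts["count"] >= MIN_COHORT]
print(safe_cohorts)                              # only cohorts big enough to stay anonymous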

First, they directly engaged with US people's accounts on Facebook using an app. I won't call out any, however you can be reasonably assured any app on Facebook isn't there for your enjoyment; it is there to get at your data. Cambridge Analytica simply created an app and had people engage with it. It might have been guessing your age or funny captions for a photo, but it reeled in a significant number of people. When you connected to that app it would have asked for certain permissions, and it then officially had access to whatever data you agreed to share. Here is where the game changed. Cambridge Analytica had realised that, due to a bug, they could access not only the data you had agreed to provide, but additional data, and just about as much again from your friends. Many of us who don't live in the US have friends in the US. Cambridge Analytica did not discriminate; they simply sucked down anything within their ability to access and took all that data into a pond of their own making.


A couple of things: unless you sent credit card details via a messenger post to someone, there is little risk of your credit card data being there. If your full home address is there, then you are at risk of having that in the pond of data Cambridge Analytica gathered.
Why did they go to that trouble, and what was the outcome?
Without providing a lesson in statistics (there are plenty of them freely available on the web), the whole point was to identify what your associations are and therefore profile you, particularly if you reside in the US. I am betting they did it for everyone whose data they had, with varying degrees of success. Why?
If you liked a lot of posts about environmental issues, it means you might not like people taking down environmental protection rules. How do I convince you that the person who is planning to take down the environmental laws is the good guy? I make sure I provide you with information showing why the current rules favour someone who is not you (you are missing out). This is the sneaky way in which they targeted people with ads or fake news items intended to alter people's perceptions. If you can create cognitive dissonance in someone, you have a reasonable chance of changing their mind.
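
As a toy illustration only, profiling from likes can be as crude as counting topics and picking the dominant one; the topics and threshold below are invented and have nothing to do with Cambridge Analytica's actual models.

# Toy sketch: infer a dominant interest from liked posts.
from collections import Counter

liked_posts = [
    ("post1", "environment"),
    ("post2", "environment"),
    ("post3", "sport"),
    ("post4", "environment"),
]

topic_counts = Counter(topic for _, topic in liked_posts)
dominant_topic, count = topic_counts.most_common(1)[0]
if count >= 3:                                   # invented threshold
    print(f"Target this user with messaging framed around '{dominant_topic}'")
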
This has been an ongoing trend in business: a customer's data is treated as more important than the customer, because once they have it, they have it potentially for life.

GDPR will put pressure on operators in Europe, and the rest of the world needs to follow the European standard. Countries which don't have data provisions similar to GDPR need to start moving, or expect to be annihilated at the next elections for failing to protect their citizens from such unruly behaviour and such disregard for users' data.

To recap: there was a breach in the way Facebook was being accessed through the provided interface. This allowed Cambridge Analytica to access far more data than they ever should have been allowed to.
This was a major failure in Facebook's engineering and its privacy and security practices.
This data then appears to have been misused to target ads at people. It is quite possible someone you sit next to at work was getting a very different political message than you and your friends, with a similar group being other cohorts from your workplace or a user group you share membership of.



Facebook is in hot water in a variety of countries right now, including the hearing in Congress next week.

See ya round

Peter

Wednesday, January 03, 2018

Azure Internal Load Balancer Configuration - SQL Server Failover Cluster Instance on Azure Virtual Machines

I have just finished working with a colleague resolving some issues with creating an IaaS SQL Server cluster in Azure. It took some trial and error, and there are some real gaps in the available information; hopefully this will help to fill one of them.

Let me first start by saying that, due to a lack of free diagnostics within Azure, you will need access to insights. This isn't something you would do with an MSDN subscription, and you might baulk at it with your own paid-for one. Why not with MSDN? Because with MSDN you can only use a minimal amount of Network Watcher resource, and you require Network Watchers to do any diagnostics at the network level. I hope you have a friendly boss who will give you space to do some learning and develop these needed Azure skills. Alternatively, you may have a bit of budget to run your own subscription and pay for the resources. Make sure you have budget alerts so you don't blow out your costs, and delete the Network Watchers as soon as you have a working system.

We had to investigate the linkage between the Azure load balancer, the SQL Server cluster and the network addresses on the SQL Server cluster. To see what was happening we needed App Insights and Network Watchers.

What was the scenario

We had a SQL Server cluster without Always On services enabled yet. We found that whilst we could connect from the node on which SQL Server was running, we couldn't get a connection from any other server or system in our subnet. The problem was hidden, as you can't do much to find out why you are seeing the problem or where it is.
We installed Wireshark; yes, it is your best friend here, and yes, it is telling the truth even when it seems stupid that it isn't seeing what you expected. We couldn't see any traffic when we tried connecting to the cluster. We got the outbound packet to initiate a connection and then "crickets": nothing responding from the cluster or the load balancer.

Let's go back a step, one of the items in your list of tasks when doing this is to create the SQL cluster and then an internal load balancer.

Let me tell you, the instructions on how to configure the internal load balancer in all the Microsoft documents and blog posts I came across were terrible on the detail.

The load balancer requires a health probe, and that requires an active port on your individual nodes to validate their availability.

People have listed ports around 50k, things like 52486 or 62159. Here is the missing bit: it has to be an active service running on your server, and it can't be anything to do with SQL Server. Those ports are bound to the cluster, and you are not able to access them via the individual node IP address.

How do you work out a port? Two options: if you have a reason, or don't overly care, you can install IIS and you will have port 80. Otherwise, try netstat -an and have a look; you will get entries like this:

 Proto  Local Address          Foreign Address        State
TCP    0.0.0.0:445            0.0.0.0:0              LISTENING
Use one of these ports for your health probe backend, that is, one of the ones near the start of the list with the Local Address starting with 0.0.0.0; these are listening on all IP addresses on your system.
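
If it helps to picture what the probe needs, here is a toy Python listener bound to 0.0.0.0; port 59999 is an arbitrary example, not a recommendation, and in practice you would point the probe at an existing service as described above. The probe only needs the TCP connection to succeed.

# Toy listener: what "listening on 0.0.0.0" means for a health probe target.
import socket

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
srv.bind(("0.0.0.0", 59999))     # all IP addresses on this node (port is an example)
srv.listen(5)
print("netstat would now show 0.0.0.0:59999 LISTENING")
while True:
    conn, addr = srv.accept()    # the probe just needs the connect to succeed
    conn.close()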

Once you configure the load balancer with this port in the health probe section, you will find your clustered SQL Server becomes available to your other services in your Azure space. You will now be able to use SQL Server Management Studio to connect to the active node in the cluster on the cluster IP address.


 


Thursday, December 14, 2017

Brisbane Yow 2017 Review


Last week I attended Yow for the first time. It provided some great talks in my broader interests of data and analytics, which is the space I currently consult in. Thanks to my employer Readify for the tickets and the opportunity to attend. We have twenty professional development days a year, and that was two of them taken.
BCEC

Day 1

Day one started with a great keynote from Dr Denis Bauer of the CSIRO talking about the challenges of working with big data in the form of the human genome. Denis was joined by Lynn Langit and they worked through explaining the project and how it all fitted together. If you didn't know, the genome sequence in data form requires a database table with 3 billion columns, yes, that is 3 with a B. Currently there is no relational database with the capability to create a table that wide (pity the poor designer or DBA required to model that one). Of course, this is a big data problem on an enormous scale. Denis spoke about how she and the team had set up workloads in AWS to process genomic datasets to deliver real opportunities to identify relationships between people's genomes and find markers for genetic conditions. There are many challenges and great opportunities. I was lucky enough to get some time to have a chat with Denis later in the day, and whilst it is exciting, there are some real issues to deal with, such as misuse and abuse from a variety of parts of society. I enjoyed the talk and then the conversation later. A fascinating woman with a fantastic mind.

I then attended a talk about problems with Agile delivered by Jeff Patton. As Readify, where I currently work, has a very agile approach to the way we work, I was interested to hear about the supposed problems and remediation. Jeff makes very effective use of a style akin to the old writing-on-slides with an overhead projector. It was engaging; I learned a few things about the place of the product owner and how we, as participants in the Agile community, can help our product owners be better.

Next in my day was AWS Security by Aaron Bedra. Aaron made many good points about securing the cloud and its services, and the point I wholeheartedly agree with is that cloud done right is more than likely much more secure than many data centres. I learned a few things and was reminded to check some work for a current client I am working with.
Getting a system up in the cloud can be very fast compared to a traditional data centre; however, with that comes a number of risks. Aaron spoke about the checklist and things you can do to make sure your approach to security is sound.

Next on my day was Jim Webber. As a DBA I am always interested in database technology, and as Neo4j has a strong market presence and SQL Server now includes a graph database, this was an opportunity to learn more. I had a few items of basic knowledge reinforced, and then Jim went on to talk about consistency in large-scale databases and what they had changed to handle it: the use of causal consistency and a causal clustering architecture to deliver better throughput, large-scale clustering and a method to maintain the integrity of data in the database. I totally enjoyed expanding my knowledge of graph databases.

The day was progressing, and next up was Chanuki Illushka Seresinhe. Chanuki spoke about what it is about beauty that makes us happy and whether it is possible to quantify it with deep learning. It was interesting to learn more about some of the concepts of deep learning. Chanuki also spoke about the fact that there are limited large datasets in some domains, which makes it hard to test in other regions and other domains. Even with the large dataset she had access to from Scenic or Not, there were large gaps in information which made the dataset less than ideal. This potentially causes all sorts of biases, one of the very real problems with AI.

Next, my afternoon continued with more computer learning and two great talks on Machine Learning.
First up was Julie Pitt. Julie spoke about the issues with training AI, biases and problems with algorithms. The key piece of her talk was about framing the AI problem right, and discussing why we are way past the time we thought we would have robots in our homes and yet the problems which stop that happening are still present. Julie is reframing the problem to have self-learning robots which adapt to ever-changing environmental situations. Her work looks at simple problems like making sure that the robot won't assume the shortest path is correct, i.e. that jumping from the second floor to the patio is the quickest way down. As a kid I grew up reading sci-fi books and Asimov; the fact that some of these problems have been well understood since then means we have work to do. Julie went on to show how the concept of a zone where the robot survives, and where part of its job is to learn and maintain its survival, was a really interesting idea to unpack. She spoke about biases and wrongful outcomes. I was lucky enough to speak to Julie afterwards at the networking drinks about some of her presentation, and she is a wonderfully engaging person to speak with about her discipline. Oh, and apparently I might have to learn Scala.

The other presentation I attended on the day on machine learning was by Jennifer Marsman. Jennifer took us through a journey of capturing data in a novel way with the EPOC EEG headset and analysing the data from it to deduce whether we could use brainwave patterns to identify lies. Jennifer was engaging and spoke with great humour to convey her message. One of the key problems which I frequently encounter across data work in all disciplines is data quality: the headset needed to be set up correctly on the volunteer to obtain consistent quality readings to be able to verify the data. Once again I was able to have some time speaking to Jennifer about her data research and the ML capabilities. She spoke about the use of Azure ML and gave a few very quick insights into understanding the ML algorithms available and the methods of training in Azure ML or any ML system.

We then wrapped up the talks of the day with Dave Farley talking about software engineering and whether the term is right to describe what developers do. Dave spoke at length about the skills in other disciplines of engineering. Should software engineers experiment? Dave said yes, and explained that it is a frequent part of civil engineering; for example, using models to wind-tunnel test the design of a high-rise is a form of experimentation and is done to minimise risk and manage eventual building costs. Dave went on to talk about where software development is at compared with other industries. He then talked about defining what engineering is and isn't, and how the work we do is in fact able to be a discipline of engineering. We just have to get some things right, and we are not doing that now.

Networking drinks and hors-d'oeuvres ended the day. I caught up with a few speakers, notably Jennifer and Julie; as a data person, what they were doing was of great interest. I also spoke to Denis that evening. I was spoilt to have some time talking with these women.

Day 2

The second day opened with a bang: Linda Liukas opened by telling us about Hello Ruby and teaching young children about computers and computing concepts. The Hello Ruby books are really an amazing creation, and what Linda has done is fantastic. Concepts talked about include learning how a loop feels, and Ruby's favourite loop; I will let you buy the books to find out. If you have young kids around, or if you just want to have fun learning about computers in a non-threatening way, these books are for you. Linda is fascinating to speak to one on one; we talked about adding the Hello Ruby books and activities to local daycare activities. I am certainly adding them to my library. Possibly my favourite speaker and talk of Yow.

The second stop of the day was the blue room and Sara Chipps, with the question: do you believe an 8-year-old girl can program in C++? Let's talk about Jewelbots. Sara has designed and developed an Arduino-based bracelet for girls. They are a rather simple-looking device, but as an Arduino device it packs a punch, not so much in what it can do but in what it is delivering. Due to the simple design and compact space, the Jewelbot couldn't house a compiler for higher-level languages. Instead, when the owner wants to program, she programs in C++ and then bootstraps the device with her new code. A young woman, and yes, 8 years old, did some live coding to configure a device. She was a champion and dealt with technical issues with grace and charm. Her parents should be proud, and her school as well.

Third stop, and off to hear about Dynamic Reteaming from Heidi Helfand. This was a really interesting talk on handling the problems of building and reconfiguring teams; no team stays the same. No matter how long it has been together, the whole team will change at some time: someone leaves or is promoted. Heidi provided a lot of great examples from her experiences of reteaming and some ideas on how to make it work, even choosing your own team. Some great insights into human dynamics and teams.

After lunch, another keynote, this time with Gregor Hohpe. He talked about Enterprise Architecture, discussed a number of problems and offered some solutions. As an EA he ripped into those who sit in ivory towers and provide colourful diagrams which are often thought of as meaningless in the world of day-to-day operations and project teams. He then talked about various patterns in architecture, and I went straight out the next day to review a few things in light of his comments. I have been working in a Solution Architect role, amongst other titles, on my current project. I enjoyed what he was talking about as it fits with a lot of what I think about the EA role, which probably comes from using PEAF as my preferred architecture methodology/framework.

Next up I listened to Katrina Owen talk about her accidental open source project and all the problems that come when you become a maintainer. Katrina is the maintainer of exercism.io, a coding education site she created out of a need to make it easier to test and challenge students she was teaching in a coding program. Much of what tore her up in trying to fix problems as a maintainer were people issues: dealing with competing priorities, maintaining balance and sorting things out to do just enough to avoid burnout, which for a period she didn't. One of Katrina's lessons: "What are you not going to do today?" That is something we all need to learn. Others include working out who or what you are doing your thing for, and who matters, because otherwise everyone's opinion is right. Another great talk.


Unfortunately, that is where my Yow day ended as far as speakers went. I had to attend a conference call which went way too long; however, it served a purpose in my project and was needed to get some things rolling.
I did get to finish up the day with a beer and network with a whole lot of people. It was here that I was able to catch up with Linda, amongst other speakers, and a number of other attendees.

Overall I had a great experience, caught up with a few old associates, made some new fledgeling connections and got some time networking with great speakers. Jump over to the Yow site if any of the speakers interest you; the slides of the talks are up and videos are to come. Yow has links back to websites, LinkedIn and Twitter for the speakers.

Let's see who is coming to Yow next year as to whether I decide to attend. I am sure there will be some great speakers, so it's hurry up and wait until they are announced.