Intro
This is a real photograph taken in Thailand on one of its serene islands. This image was not generated by DALL·E 3.
AI took us there though...
Hello and welcome back, ML, AI and cybersecurity experts and leaders. I’ve been away for some time on a long holiday and have a lot to share. Before I get into more advanced topics like data strategy and AI-related consulting Q&As, I want to start with an easy read. Well, 'easy to read' and 'written by Radek' sounds almost like an oxymoron, but I gave it a try.
Before I start I need to confess: yes, I used an LLM chatbot lately (passé already?).
I did not use it to create content though. I used it for a different use case, as a different type of application.
My partner and I used the chatbot to help us plan our Asia holidays instead.
The ultimate outcome of our experiment was exciting and intriguing: a small, idyllic Thai island, selected with AI's assistance, emerged as the top highlight of our four-week journey through three countries, seas, islands, mountains and cities. I mention this to show how well the AI-assisted pick held up against all the other amazing highlights (it did win in the end).
Is there AI magic?
If you add your human brain to the mix.
All right, but was it a simple ‘hey - we go to Asia, where should we go to relax?’ question? No, of course it was not. In fact, the answer we received to the above hey-AI-make-magic-for-me type of question was a lazy, googlable recommendation that neither of us liked.
The AI did help eventually, a lot, but it needed far more ‘attention’ to detail, more specificity, and a couple of iterations (experimenting), as well as work with the data (a data exploration phase).
Get the results with the quality you deserve
What was very important was (drum roll):
- data exploration phase
- feature engineering
- data engineering and normalisation
All of the above was needed for the AI to give us the best, most optimised and most accurate recommendation. ‘Data and feature engineering’ and the ability to define multi-dimensional problem statements were the key to success. Simply knowing and clearly expressing what we wanted and what we didn't want, providing well-defined, specific requirements, and defining basics like ‘good’ and ‘bad’ were essential.
The problem Statement
Taking a step back. Why was it difficult to pick the place for us in the first place?
It was simply because each of us had different requirements and preferences. There were overlaps, but there were differences too. It was not that easy to find a place that was perfect for both of us. I am sure many of you have been there too: a team cannot agree on 'what's next' even though everyone on the team has the best intentions.
But what does 'best' even mean? For us the negotiation process felt like optimising a single AI model for two different objective functions at the same time. In the cybersecurity world, for example, a common similar problem is the confusion between optimising for true-positive alerts and optimising for i-want-to-see-this-type-of-alerts-for-context. They sound similar but can be very different. This kind of trap (misalignment between stakeholders on what we should optimise for, often without anyone recognising it) can be one of the main reasons machine learning projects fail. OK, end of digression.
It all starts with Hi! The raw data
So, here is how we started. Very politely, the same way we would act towards a stranger, we introduced ourselves (with fake names). We described ourselves a bit in free text (likes and dislikes, temperaments, tolerance for surprises, interests, expectations). It was like providing and ingesting raw data before normalisation. After this we asked again for the best islands for us.
The results were still not good enough. We were offered a broad set of options, similar to what we would get at any ‘tourist broker/office’. They all looked interesting but there was no strong winner. Not good enough.
Feature Engineering
This forced us to be more specific about our features. Instead of describing each of us in free text we decided to normalise it a bit. We explained our needs in a bullet point (columnar) fashion, each of us using the same template and the same set of ordered parameters.
Using the same normalisation principles, we described what we expected from our destination.
Example features for me were binary features like:
- access_to_nature
- quiet
- access_to_sport
- not_well_known_place
and for my partner:
- small
- good_food
- access_to_nature
- sea_side
- sunset_views
Similarly, the features describing our personalities were things like level_of_stress_tolerance, comfort_needs, noise_level_tolerance, cleanliness_tolerance and budget_size.
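For the code-minded, the "same template, same ordered parameters" idea can be sketched in a few lines of Python. This is only an illustration of the normalisation step, not what we actually typed into the chatbot: the function names `encode_wishes` and `as_prompt_block` are made up for this sketch, and the 0/1 encoding is an assumption about what "binary features" means here.

```python
# One shared, ordered template for both travellers (feature names from the lists above).
DESTINATION_FEATURES = [
    "access_to_nature", "quiet", "access_to_sport", "not_well_known_place",
    "small", "good_food", "sea_side", "sunset_views",
]

def encode_wishes(wanted):
    """Return a 0/1 value for every template feature, always in the same order."""
    unknown = set(wanted) - set(DESTINATION_FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {name: int(name in wanted) for name in DESTINATION_FEATURES}

def as_prompt_block(person, profile):
    """Render a profile as bullet points that can be pasted into a chat."""
    return "\n".join([f"{person}:"] + [f"- {k}: {v}" for k, v in profile.items()])

radek = encode_wishes({"access_to_nature", "quiet", "access_to_sport", "not_well_known_place"})
partner = encode_wishes({"small", "good_food", "access_to_nature", "sea_side", "sunset_views"})

# Features both of us asked for: the overlap mentioned in the problem statement.
shared = [name for name in DESTINATION_FEATURES if radek[name] and partner[name]]

print(as_prompt_block("Radek", radek))
print(as_prompt_block("Partner", partner))
print("shared:", shared)
```

Because both profiles use the same keys in the same order, they are easy to compare line by line, which is the whole point of the template.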
Getting more intimate with your data and Data engineering
After we defined the dimensions in a more structured fashion, the problem, even in our own brains, started to feel like we were getting somewhere. We cross-checked with the AI and asked it to cluster the initially proposed destinations and match them against the specified features, but without recommendations yet.
This exercise allowed us to filter certain options out of the result set of potential winners. After some simple eyeballing it became clear that some places were too extreme either for me or for my partner. This reduced noise but potentially made the work a bit more difficult for the AI, as the remaining options were more similar to each other. Either way it increased our intimacy with the data and with the problem scope. We just needed to keep in mind that, to balance out the impact of removing some data, we would need to offer more features instead.
We resumed and asked the AI to label (classify) the destination options as PARTNER, RADEK or BOTH to indicate which one was a better match for me and which was a better match for her. We also asked it to list the features that matched for each place.
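The chatbot did this labelling for us, and we do not know how it decided internally. A minimal Python sketch of one way such a label could be derived is below. Everything in it is an assumption made for illustration: the candidate islands and their features are invented, and the rule (BOTH when each of us gets at least 60% of our wishes) is just one plausible heuristic.

```python
RADEK_WANTS = {"access_to_nature", "quiet", "access_to_sport", "not_well_known_place"}
PARTNER_WANTS = {"small", "good_food", "access_to_nature", "sea_side", "sunset_views"}

# Invented candidates; not the places we actually compared.
CANDIDATES = {
    "Island A": {"quiet", "small", "sea_side", "sunset_views",
                 "access_to_nature", "not_well_known_place"},
    "Island B": {"good_food", "sea_side", "sunset_views"},
    "Island C": {"access_to_sport", "quiet", "not_well_known_place"},
}

def matched(features):
    """List which wishes of each person a destination satisfies."""
    return sorted(features & RADEK_WANTS), sorted(features & PARTNER_WANTS)

def label(features, threshold=0.6):
    """Label a destination as RADEK, PARTNER or BOTH (assumed heuristic)."""
    radek_share = len(features & RADEK_WANTS) / len(RADEK_WANTS)
    partner_share = len(features & PARTNER_WANTS) / len(PARTNER_WANTS)
    if radek_share >= threshold and partner_share >= threshold:
        return "BOTH"
    # Ties below the threshold fall to PARTNER; an arbitrary choice in this sketch.
    return "RADEK" if radek_share > partner_share else "PARTNER"

for name, features in CANDIDATES.items():
    print(name, label(features), matched(features))
```

Asking for the matched features next to each label is what made the result checkable: we could see why a place was tagged the way it was.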
At the same time this was still a kind of ‘data exploration’ exercise, giving us a better feel for the features, the dimensions, the potential result set and the signal strength of different features (e.g. access_to_nature, funnily enough, had a very weak signal after we removed some obviously bad candidates). This allowed us to remove some extra features and add others that we initially believed would not matter (e.g. access from airports and access via ferries). In the end, the more features with a strong signal, the better the accuracy, right?
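Why would a feature lose its signal after filtering? If every remaining candidate has it, it can no longer tell the candidates apart. A small Python sketch makes this concrete. "Signal" is used informally in this post, so measuring it as the variance of a 0/1 feature is an assumption, and the candidates are invented.

```python
def signal_strength(candidates, feature):
    """Variance of a 0/1 feature across candidates; 0.0 means it separates nothing."""
    share = sum(feature in feats for feats in candidates.values()) / len(candidates)
    return share * (1 - share)

# Invented candidates: one obviously bad fit that lacks access to nature.
before = {
    "Island A": {"access_to_nature", "quiet"},
    "Island B": {"access_to_nature", "good_food"},
    "Island C": {"access_to_nature", "access_to_sport"},
    "Big City": {"good_food"},
}

# Remove the obviously bad candidate, as we did by eyeballing.
after = {name: feats for name, feats in before.items() if name != "Big City"}

print(signal_strength(before, "access_to_nature"))  # some signal
print(signal_strength(after, "access_to_nature"))   # none left
print(signal_strength(after, "good_food"))          # still separates the rest
```

Once all remaining places share a feature, it stops helping with the choice, which is when adding new features starts to pay off.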
Classification and recommendation
Once we were satisfied with this feature engineering exercise, we asked the bot again to list the winning candidates with the features per candidate and to classify each candidate as PARTNER, RADEK or BOTH. The result was a nice free-text explanation but too verbose, so I asked it to create a table with one candidate per row and the features in columns. It produced a very readable table. At this point everything was clear, and when we finally asked the AI to recommend the final decision it matched what we had started to work out for ourselves: the Quiet Isle.
Closing remarks
So we went to the Quiet Island (ping me if you want to know its name) in Thailand, which ended up being our own dream paradise. This is the island where I lost my phone thanks to the mind-blowing miracles of nature that could be seen in the morning by the sea from the white sand beaches.
This story carries some optimism for me. It shows that optimising a single model for two different objective functions in a relationship (make Radek happy and make the partner happy) can actually work, assuming there are shared features (values), especially ones with strong signals.
And yes, the AI tool proved to be very useful, but you cannot be lazy, and it did not replace us. It is not freeing us (people) from being people. In fact it delivers better results when you approach it with clear, well-defined self-awareness.
Bonus, it is all about communication
The scoring mechanism can be tuned as you like (of course, just talk to it!), and the ranking can be displayed if you ask for it. You can even micromanage the heuristic that the AI uses to rank the recommendations if you ask clearly. Playing with it was fun. Give it a go and come back to me with feedback. Thanks!
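If you prefer to see the tuning idea in code, here is a minimal Python sketch of a weighted ranking. It is not the heuristic the chatbot used (we only ever talked to it); the weights, the `rank` function and the two candidates are all invented to show how changing one weight can change the winner.

```python
def rank(candidates, weights):
    """Score each candidate as the sum of the weights of its features, best first."""
    scores = {
        name: sum(weights.get(feature, 0.0) for feature in features)
        for name, features in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented candidates and features.
candidates = {
    "Island A": {"quiet", "access_to_nature", "sunset_views"},
    "Island B": {"good_food", "sea_side"},
}

features = ["quiet", "access_to_nature", "sunset_views", "good_food", "sea_side"]

# Start with every feature counting the same.
equal_weights = dict.fromkeys(features, 1.0)
print(rank(candidates, equal_weights))

# "Tune" the scoring: tell the ranking that food matters three times as much.
food_first = {**equal_weights, "good_food": 3.0}
print(rank(candidates, food_first))
```

In a chat the equivalent is a sentence such as "weigh good food three times as much as everything else and show me the ranking again".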