<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Man Behind The System]]></title><description><![CDATA[As in car it's not about the car itself, it's about man behind the wheel. That's the same philosophy I want to introduce using this blog, it is always about Man]]></description><link>https://mbts.dev</link><generator>RSS for Node</generator><lastBuildDate>Sun, 19 Apr 2026 20:42:48 GMT</lastBuildDate><atom:link href="https://mbts.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Why Wildcard Searches in Redis Can Risk Your Job]]></title><description><![CDATA[We were happily integrating Redis into some parts of our decently-sized app until one day it broke. We mistakenly used a seemingly innocent command in our most frequently hit API endpoints, our core business-logic, which cost us a lot of potential sa...]]></description><link>https://mbts.dev/why-wildcard-searches-in-redis-can-risk-your-job</link><guid isPermaLink="true">https://mbts.dev/why-wildcard-searches-in-redis-can-risk-your-job</guid><category><![CDATA[Redis]]></category><category><![CDATA[Mistakes to Avoid]]></category><dc:creator><![CDATA[Iqbal Maulana]]></dc:creator><pubDate>Sun, 13 Apr 2025 16:21:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/sxQz2VfoFBE/upload/b121f937f80f7d441c3666824eb0e38d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We were happily integrating Redis into some parts of our decently-sized app until one day it broke. We mistakenly used a seemingly innocent command in our most frequently hit API endpoints, our core business-logic, which cost us a lot of potential sales from our users.</p>
<p>Luckily, no one got fired from this incident. If it had ended differently, I would have been the first to go, LOL. In this post, I'll share our experience so you can AVOID making the same mistake we did.</p>
<h1 id="heading-everybody-loves-speed">Everybody Loves Speed</h1>
<p>Our customers loved it, our clients demanded it, our PMs were really happy about it, and we engineers could flex our double-digit-millisecond response times on LinkedIn.</p>
<p>That's why we love Redis. It allows us to achieve speed with a very simple setup, and apparently, most developers agree. According to this <a target="_blank" href="https://survey.stackoverflow.co/2024/technology#most-popular-technologies-database-prof">Stack Overflow Developer Survey 2024</a>, Redis is still among the top 10 most popular databases among professionals.</p>
<p>We used it in the most critical part of an e-commerce app we built, which is to cache product data. To be more specific, our product data is not much different from other e-commerce apps; it's the pricing that makes it unique. In short, each customer can see different prices based on their region. It's similar to how Steam sets game prices differently depending on where on earth you live.</p>
<p>When I wrote this article, our user base was about 15,000 users. Each user could have either shared regional pricing or unique pricing just for them. This business requirement results in millions of records for product pricing. Searching through these records every time users browse products is slow, and there's only so much that database indexing can do. Hence, we cache them.</p>
<p>The second unique aspect of our app is that some products have more volatile pricing than others, and our e-commerce app is not the only sales channel. Therefore, we use a third-party ERP to store the most up-to-date pricing data for each product for each customer. It also acts as an aggregator for all orders from different sales channels.</p>
<p>So, we need to check the current price for each product right when a customer clicks the checkout button. After getting the most up-to-date, and possibly different, price, we must invalidate the cache and update it with the new pricing for use on pages other than the checkout page.</p>
<p>Okay, I hope this brings us to the same understanding of what we're dealing with. Now, let's review some Redis internals to further explain why it is incredibly fast and very suitable for our use case.</p>
<h1 id="heading-redis-speed-how">Redis == Speed, How?</h1>
<p>According to <a target="_blank" href="https://blog.bytebytego.com/p/why-is-redis-so-fast">ByteByteGo</a>, Redis is incredibly fast because it is a RAM-based database, which is 1000x faster than accessing data from a disk. As a developer, I like to think of Redis as similar to the in-memory array or list we use in our programs to store data. It is indeed much faster compared to writing data to a .csv file, for example.</p>
<p>The next reason Redis is fast is that it's single-threaded! I know it might seem counterintuitive because we usually speed up apps by splitting tasks so multiple computations can run in parallel through multi-threading.</p>
<p>However, consider this: running tasks in parallel with multi-threading or multi-processing requires careful management of locks or mutexes. Different processes can write to the same data, one process might read data while others are changing it, and let's not forget about deadlocks. I recently gave a talk about this locking topic, specifically for Postgres. You can check it out <a target="_blank" href="https://www.youtube.com/watch?v=RFkytUdY-yU">here</a> to get a sense of how complex multi-threaded apps are and how they require more advanced concurrency control or locking.</p>
<p>By using a single-threaded approach, Redis doesn't need a locking mechanism, which speeds up data queries and mutations because it doesn't have to check what other processes are doing. <strong>Instead, it serializes every incoming command and executes it one at a time.</strong></p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">As for how Redis can handle multiple connections at once while being single-threaded, it uses a concept called I/O Multiplexing and Non-blocking I/O. However, the details of this are beyond the scope of this article.</div>
</div>

<h1 id="heading-watch-out-for-contention">Watch Out for Contention</h1>
<p>By being single-threaded, each command is executed one at a time, or atomically, meaning the main event loop is blocked for as long as it takes for that command to finish. This is the part we often overlook.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://www.youtube.com/watch?v=Suugn-p5C1M">https://www.youtube.com/watch?v=Suugn-p5C1M</a></div>
<p> </p>
<p>Contention in Redis is similar to the traffic jam in the video above. A slight slowdown of one car creates a domino effect, slowing down other cars, and eventually, you get stuck in traffic.</p>
<p>Think of the circular road in the video as the Redis event loop, with each individual car representing a command that needs to be executed one by one. A single command that takes slightly longer to complete can block other commands, while new commands keep coming, creating contention, or Redis traffic, if you prefer that term.</p>
<p>That's exactly what we're experiencing. Remember our e-commerce checkout workflow? We need to update the product pricing every time a customer places an order, and currently, we average 3,000 orders daily.</p>
<p>We cache the product and its price using this Redis key format: <code>product/:productID/:customerID</code>. To keep the pricing data in the cache up-to-date, we need to invalidate all caches for that <code>productID</code> across all <code>customerID</code>s. However, we don't know which customers already have the cache filled, so we use a wildcard search to find all matching keys with this pattern, <code>product/:productID/*</code>, just to invalidate them, 3,000 times a day.</p>
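<p>To make this concrete, here is a minimal Python sketch of the key scheme and what a wildcard match conceptually does. It uses an in-memory list plus <code>fnmatch</code> as a stand-in for a real Redis keyspace, so the glob behavior is only an approximation of Redis pattern matching.</p>

```python
from fnmatch import fnmatchcase

def product_key(product_id: int, customer_id: int) -> str:
    """Cache key scheme from the article: product/:productID/:customerID"""
    return f"product/{product_id}/{customer_id}"

def keys_matching(all_keys: list[str], pattern: str) -> list[str]:
    """What a wildcard search does conceptually: test EVERY key in the
    keyspace against the glob pattern."""
    return [k for k in all_keys if fnmatchcase(k, pattern)]

# Simulated keyspace: product 42 cached for three customers, product 7 for one.
keyspace = [product_key(42, c) for c in (1, 2, 3)] + [product_key(7, 9)]

# Invalidating product 42 for ALL customers means scanning the whole keyspace.
stale = keys_matching(keyspace, "product/42/*")
print(stale)  # ['product/42/1', 'product/42/2', 'product/42/3']
```

<p>Note that the cost of <code>keys_matching</code> grows with the total number of keys, not with the number of matches, which is exactly the trap here.</p>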
<p>While this is perfectly okay with smaller datasets, as our cached dataset grows, the same command starts to slow down, burning the whole app down.</p>
<p>You might wonder, isn't this problem only affecting the checkout page? I must be exaggerating by saying this small mistake is bringing our app down. Well, the checkout page isn't the only place we use Redis. We also use it as a rate limiter because our earlier infrastructure didn't handle this out of the box. Remember that Redis executes every command one at a time? This means a simple <a target="_blank" href="https://redis.io/docs/latest/commands/set/">SET</a>, <a target="_blank" href="https://redis.io/docs/latest/commands/get/">GET</a> and <a target="_blank" href="https://redis.io/docs/latest/commands/incr/">INCR</a> command to track how many requests are made by a client IP address also has to wait for that slow wildcard search command. So, this slowdown is literally burning the whole app.</p>
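<p>For context, the rate-limiter logic itself is tiny, which is why it hurt so much that it had to queue behind a slow wildcard search. Below is a hedged Python sketch of the fixed-window INCR-style pattern, with a plain dict standing in for Redis; the function name, limit, and window are illustrative assumptions, not our production values.</p>

```python
import time

# In-memory stand-in for Redis: key -> (request count, window expiry time).
counters: dict[str, tuple[int, float]] = {}

def allow_request(client_ip: str, limit: int = 100, window_s: int = 60) -> bool:
    """Fixed-window rate limiting, mirroring the Redis INCR + EXPIRE pattern."""
    now = time.monotonic()
    count, expires = counters.get(client_ip, (0, now + window_s))
    if now >= expires:                       # window elapsed: start a new one,
        count, expires = 0, now + window_s   # like the Redis key expiring
    count += 1                               # the INCR step
    counters[client_ip] = (count, expires)
    return count <= limit

assert allow_request("203.0.113.7", limit=2)      # 1st request: allowed
assert allow_request("203.0.113.7", limit=2)      # 2nd request: allowed
assert not allow_request("203.0.113.7", limit=2)  # 3rd request: blocked
```

<p>In Redis the whole check is one or two O(1) commands, yet each of them still waits its turn in the same event loop as everything else.</p>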
<h1 id="heading-redis-search-benchmark">Redis Search Benchmark</h1>
<p>There are two commands we know of for doing wildcard searches on Redis: <a target="_blank" href="https://redis.io/docs/latest/commands/keys/">KEYS</a> and <a target="_blank" href="https://redis.io/docs/latest/commands/scan/">SCAN</a>. If you click the links for these commands, you'll be taken to the Redis documentation, where it's clear that both commands are marked as @slow by Redis, and KEYS is also marked as @dangerous. I didn't fully understand the importance of this warning before—how slow could it really be, right??</p>
<p>The KEYS command performs a full scan with O(N) complexity, going through every key in your dataset to find those that match your pattern. It blocks the event loop until it has checked <strong>every key</strong>. On the other hand, SCAN does the same task in chunks, blocking the event loop only until a portion of keys is checked, allowing other commands to be executed without waiting for all keys to be visited like KEYS does. Therefore, SCAN is generally recommended for wildcard searches in production. But is it?</p>
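<p>The difference can be sketched with a toy model: KEYS visits every key before returning anything, while SCAN yields control between chunks. The Python sketch below fakes that chunking with a list cursor; real SCAN iterates a hash table, treats COUNT only as a hint, and may return duplicate keys, so treat this strictly as a mental model.</p>

```python
def scan_chunks(all_keys: list[str], count: int = 2):
    """SCAN-style iteration: walk the keyspace in small chunks, yielding
    between chunks instead of blocking until every key is visited."""
    cursor = 0
    while cursor < len(all_keys):
        chunk = all_keys[cursor:cursor + count]
        cursor += count
        yield chunk  # in Redis, other commands can run between these chunks

keys = [f"product/42/{c}" for c in range(5)]
chunks = list(scan_chunks(keys, count=2))
assert len(chunks) == 3          # 5 keys visited in chunks of 2, 2, and 1
assert sum(chunks, []) == keys   # every key is still visited
```

<p>The catch: SCAN unblocks the event loop between chunks, but the total work is still proportional to the size of the keyspace.</p>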
<p>An earlier version of our app used the <a target="_blank" href="https://redis.io/docs/latest/commands/keys/">KEYS</a> command for wildcard searches. Later, we refactored it to use <a target="_blank" href="https://redis.io/docs/latest/commands/scan/">SCAN</a>, only to find that it didn't save our app. Below, we will show some benchmark results to support this finding. The specifics of the machine used for the benchmark are not important. We'll present the trend of how Redis wildcard search command latency changes with different dataset sizes.</p>
<p>For this benchmark, we first fill the database with 1,000, 100,000, and 1,000,000 keys. Then, we run both the KEYS and SCAN commands 100 times each and calculate the average to show the trend. You can visit the code for this benchmark here:</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/IqbalLx/redis-search-benchmark/tree/main">https://github.com/IqbalLx/redis-search-benchmark/tree/main</a></div>
<p> </p>
<p>Here are the results:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1744559911421/e7f090b1-f342-485e-8d69-60617dd06a1b.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1744559923094/c17833d2-d50c-4b84-9f2a-453bdc31e038.png" alt class="image--center mx-auto" /></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1744559943488/23e791ad-aaca-4f63-919c-7eda98e4a9dd.png" alt class="image--center mx-auto" /></p>
<p>To clearly show the trend, here are the summarized results:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1744559993721/e33780bb-0c3b-407a-b2c6-b02e84c790ad.png" alt class="image--center mx-auto" /></p>
<p>We can clearly see that while it's perfectly fine with smaller datasets—1,000 keys are still searched in single-digit milliseconds, and 100,000 keys are mostly searched in under 100 ms—things get unpredictable when we reach 1,000,000 keys, where it exceeds one second.</p>
<p>If a single command takes one second, and we use Redis extensively as a rate limiter for every API endpoint, as well as 3,000 times a day to invalidate cached product data, it's clear why our app burned to the ground.</p>
<h1 id="heading-solution">Solution</h1>
<p>After discovering that Redis wildcard searches slowed down our entire app, we changed our core logic. We discussed with our clients and PMs the need to let go of pricing consistency. This meant that when one customer checks out, the regional pricing cache for the same product would not be updated for other customers. By doing this, we removed the need for wildcard searches in Redis altogether, only invalidating the product cache for the single customer who clicks checkout at that time.</p>
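<p>With that change, invalidation becomes a single exact-key delete instead of a search. A minimal sketch, using a dict as a stand-in for Redis (with a Redis client this would be one DEL on the exact key):</p>

```python
# In-memory stand-in for the Redis cache.
cache: dict[str, dict] = {
    "product/42/1": {"price": 10},
    "product/42/2": {"price": 12},
}

def invalidate_for_customer(product_id: int, customer_id: int) -> None:
    """Exact-key invalidation: one O(1) delete, no wildcard search needed."""
    cache.pop(f"product/{product_id}/{customer_id}", None)

invalidate_for_customer(42, 1)  # only the checking-out customer's cache drops
print(sorted(cache))  # ['product/42/2']
```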
<p>However, we are not completely abandoning Redis wildcard searches. Where absolutely necessary, we use SCAN, which is a safer alternative compared to KEYS. We use it in non-customer-facing applications, such as background jobs that run only once an hour. Compared to the previous production load, the number of chunks needed to retrieve all matching keys is significantly lower.</p>
<h1 id="heading-conclusion">Conclusion</h1>
<p>If you're considering doing a wildcard search in Redis, pause and explore alternatives. Consider changing your logic, data structure, or key pattern in Redis. After this incident, <strong>we generally don't recommend searching through Redis keys unless it's absolutely necessary.</strong> Thank you for reading, and I hope this helps you keep your job!</p>
]]></content:encoded></item><item><title><![CDATA[An Effort to Fight Illegal Online Gambling (Judol) Promotions in Indonesia using AI]]></title><description><![CDATA[Amid all the other nonsensical things happening in Indonesia recently, our YouTube comment section is flooded with illegal online gambling promotions, known as Judol. From creators with hundreds of views to those with millions of subscribers, if you ...]]></description><link>https://mbts.dev/an-effort-to-fight-illegal-online-gambling-judol-promotions-in-indonesia-using-ai</link><guid isPermaLink="true">https://mbts.dev/an-effort-to-fight-illegal-online-gambling-judol-promotions-in-indonesia-using-ai</guid><category><![CDATA[generative ai]]></category><category><![CDATA[Online Gambling]]></category><dc:creator><![CDATA[Iqbal Maulana]]></dc:creator><pubDate>Fri, 21 Mar 2025 03:38:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/dmc4sVdnSDs/upload/347af3385307b07b0e342d8f070ed64e.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Amid all the other nonsensical things happening in Indonesia recently, our YouTube comment section is flooded with illegal online gambling promotions, known as Judol. From creators with hundreds of views to those with millions of subscribers, if you watch their videos daily, chances are you'll encounter these Judol comments. What's unique about them is that the posters are very aware of YouTube's moderation system, which would easily block their words. So, they cleverly use fancy Unicode characters like Mathematical Alphanumeric Symbols, Enclosed Alphanumerics, and Cyrillic blocks. Thanks to ChatGPT for helping me discover these characteristics.</p>
<p>Because of this, traditional keyword-matching systems struggle to recognize these fancy-Unicode words as variants of the plain-text keywords they block. That's exactly what's happening with YouTube's moderation settings. In the creator dashboard, you can only block exact keywords or block the commenters. Unfortunately, after reading some creator feedback, it's clear that manually blocking these fancy-Unicode words one by one is very hard and time-consuming. In this article, I bring in the big gun, Generative AI (an LLM), to help us extract these Judol keywords, potentially saving creators time when blocking them and reducing the space for Judol promotions across YouTube.</p>
<h1 id="heading-dear-youtube-creators">Dear YouTube Creators</h1>
<p>You can visit the website directly at <a target="_blank" href="https://judol-watchdog.mbts.dev">https://judol-watchdog.mbts.dev</a>. There, you can see the Judol keywords I gathered by sampling comments from the <a target="_blank" href="https://www.youtube.com/@williamjakfar">@WilliamJakfar</a> channel.</p>
<p>As instructed on the website, you can copy all these keywords and paste them into the <strong>Blocked Keywords</strong> section under Settings &gt; Community Moderation.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1742525826708/df0438d9-8115-428b-b45d-4fd0c4eab209.png" alt class="image--center mx-auto" /></p>
<p>Additionally, you can also block users who promote those Judol websites in the same settings page, under the <strong>Hidden Users</strong> section.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1742526027135/20ec6a2f-a15e-4e92-b90a-24be1019e423.png" alt class="image--center mx-auto" /></p>
<p>By doing this, we can limit the keywords previously used to promote Judol. Although, according to my calculations, there are over 1 billion possible combinations to construct a Judol website using the Fancy Unicode I mentioned earlier, I still hope that this small action, combined with many creators joining forces to block all these known keywords from their channels, can reduce Judol promotions across YouTube.</p>
<h1 id="heading-dear-engineers">Dear Engineers</h1>
<p>I hope you're curious about how I built this website from the ground up. If so, let’s break it down, or jump straight to the code on my GitHub below.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/IqbalLx/judol-watchdog/">https://github.com/IqbalLx/judol-watchdog/</a></div>
<p> </p>
<p>Every project I create needs a purpose. Besides being annoyed by these Judol comments, I wanted to learn a new tech stack. In this project, I wanted hands-on experience with <a target="_blank" href="https://bun.sh/">Bun</a>. I aimed to use Bun as natively as possible, testing their claim that Bun is batteries-included, with no need for third-party HTTP servers, HTTP clients, or SQL connectors. And that's exactly what I did—I used all of Bun's native functionality. For the client side, I didn't want to use React, so I chose HTMX. I then dockerized everything and deployed it to <a target="_blank" href="https://fly.io">Fly.io</a>.</p>
<p>Now for the LLM parts: as a broke college student, I used Meta Llama 3 70B hosted on <a target="_blank" href="https://groq.com/">Groq</a> for its affordable pricing. To make it even cheaper, I used their <a target="_blank" href="https://console.groq.com/docs/batch">batch-processing</a>, which offers a 25% discount. Below is the general flow I use.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1742527412838/d93b34eb-3b3a-40b6-b455-0db0f247aab3.png" alt class="image--center mx-auto" /></p>
<p>During development, I found that smaller models often produce inaccurate answers when extracting Judol keywords, so I chose to use Meta Llama 3 70B, which is said to be comparable to GPT-4 on some metrics.</p>
<p>I also noticed that when processing more than 50 comments per request, Llama tends to repeat words and hallucinate. Therefore, I limit each request to 50 comments. Below, you can find the System Prompt I use, with the temperature set to 1 and max tokens set to 1024.</p>
<pre><code class="lang-plaintext">You are an assistant to help reduce illegal online-gamble promotion in youtube comments.
You will be provided an array of youtube comments inside &lt;comment&gt; tag.
You need to extract exact word from given comments that are highly possible to be the online-gambling name.
Do not hallucinate, only response with text within provided comments.

Examples:

&lt;comment&gt;
Buat yang belum coba, kalian harus coba sekarang juga di 𝘼𝘌𝐑𝘖𝟴𝟪!
Gacir banget tiap main di АHMA𝘿𝑇O𝙏𝐎,nggak pernah bikin kecewa!
Nggak salah pilih main di 𝐴𝐆U𝐒𝑇O𝘛О,rezekinya ngalir terus. Top banget!
Gak ada yang tau kapan rezeki datang, tapi di A𝐆𝑈𝑆T𝐎𝘛О,semuanya bisa terjadi!
Hasil gacir bikin aku makin puas main di 𝐀𝙀𝙍𝙊𝟴𝟾,makasih banyak!
АGU𝑆𝑇𝑂𝑇Omenawarkan berbagai fitur yang menarik bagi sebagian pemain.
Main bentar langsung gacir. Rezeki nggak bisa diprediksi di D𝑂𝙍A7𝟩!
𝐦𝐚𝐢𝐧 𝐝𝐢 sini 𝐠𝐚𝐜𝐨𝐫 𝐡𝐚𝐛𝐢𝐬 𝐛𝐚𝐫𝐮 𝐬𝐚𝐣𝐚 𝐦𝐚𝐢𝐧 𝐬𝐮𝐝𝐚𝐡 𝐝𝐢 𝐤𝐚𝐬𝐢 𝐦𝐚𝐱𝐰𝐢𝐧 𝐢 𝐥𝐨𝐯𝐞 𝐲𝐨𝐮 sawer4d 𝐞𝐦𝐦𝐦𝐦𝐮𝐚𝐚𝐚𝐡𝐡.
&lt;comment/&gt;

𝘼𝘌𝐑𝘖𝟴𝟪,АHMA𝘿𝑇O𝙏𝐎,𝐴𝐆U𝐒𝑇O𝘛О,A𝐆𝑈𝑆T𝐎𝘛О,𝐀𝙀𝙍𝙊𝟴𝟾,АGU𝑆𝑇𝑂𝑇O,D𝑂𝙍A7𝟩,sawer4d
</code></pre>
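<p>For anyone reproducing the batch setup, here is a hedged sketch of how one JSONL batch line for a 50-comment chunk could be built. The <code>custom_id</code> scheme and the model name are my illustrative assumptions, and the line shape assumes Groq's OpenAI-compatible batch format, so check the Groq batch docs before relying on it.</p>

```python
import json

# Shortened here; the full system prompt is shown above.
SYSTEM_PROMPT = (
    "You are an assistant to help reduce illegal online-gamble promotion "
    "in youtube comments."
)

def build_batch_line(chunk_id: int, comments: list[str],
                     model: str = "llama3-70b-8192") -> str:
    """Build one JSONL line for a chunk of at most 50 comments."""
    assert len(comments) <= 50  # larger chunks made the model hallucinate
    user_content = "<comment>\n" + "\n".join(comments) + "\n<comment/>"
    return json.dumps({
        "custom_id": f"judol-chunk-{chunk_id}",  # assumed naming scheme
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,                      # assumed model id
            "temperature": 1,
            "max_tokens": 1024,
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_content},
            ],
        },
    })

line = build_batch_line(0, ["comment A", "comment B"])
```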
<h1 id="heading-closing-statement">Closing Statement</h1>
<p>I hope this article provides valuable insights for both YouTube creators and software engineers. Please give it a try and share your feedback. Especially for YouTube creators, if you notice more comments on your channel that aren't listed on the website, you can email me at iqbal@mbts.dev to have your channel automatically scanned by this app as well. Thank you!</p>
]]></content:encoded></item><item><title><![CDATA[From Busy to Productive: Reducing Low-Leverage Work in Software Development]]></title><description><![CDATA[In today’s fast-paced software engineering world, many developers mistakenly believe that being busy means being productive. This happened to me early in my career when I tried so hard, saying "yes" to everything and doing anything just to appear bus...]]></description><link>https://mbts.dev/from-busy-to-productive-reducing-low-leverage-work-in-software-development</link><guid isPermaLink="true">https://mbts.dev/from-busy-to-productive-reducing-low-leverage-work-in-software-development</guid><category><![CDATA[Productivity]]></category><category><![CDATA[effective engineer]]></category><dc:creator><![CDATA[Iqbal Maulana]]></dc:creator><pubDate>Mon, 07 Oct 2024 10:34:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/45sjAjSjArQ/upload/01a540bfd20b3b19074b5b6e558bda83.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In today’s fast-paced software engineering world, many developers mistakenly believe that being busy means being productive. This happened to me early in my career when I tried so hard, saying "yes" to everything and doing anything just to appear busy. However, as Edmond Lau explains in <em>The Effective Engineer</em>, true effectiveness is not about how much time you work but about the <strong>impact</strong> of your work.</p>
<p>Engineers often get stuck with low-leverage tasks—activities that take a lot of time but offer little value. This post explores how to spot and avoid these tasks, allowing you to concentrate on high-leverage activities that maximize your contribution and boost your efficiency.</p>
<h2 id="heading-what-is-low-leverage-work">What is Low-Leverage Work?</h2>
<p>Low-leverage work involves tasks that consume a lot of time but provide little value or impact. These activities often include attending unnecessary meetings, dealing with minor bug fixes, or manually doing tasks that could be automated.</p>
<p>While sometimes necessary, these activities usually offer minimal long-term benefits. According to <em>The Effective Engineer</em>, effective engineers focus on work that delivers the most value for the time spent, so it's crucial to identify and reduce low-leverage tasks whenever possible.</p>
<p>Engineers often get caught up in low-leverage work due to cultural or organizational pressures. A common reason is the habit of "looking busy," which many workplaces unintentionally promote. Engineers might feel compelled to attend every meeting or respond to all emails to appear productive, even if these tasks don't lead to meaningful progress. Additionally, the fear of missing out (FOMO) can drive engineers to participate in decisions or discussions where their input isn't needed. Without proper prioritization, engineers tend to handle easier, low-leverage tasks that offer quick wins but little impact.</p>
<h2 id="heading-lets-reflect">Let’s Reflect</h2>
<p>When I first started my career in the software industry, several low-leverage activities took up most of my time, such as:</p>
<ul>
<li><p>Attending business direction meetings during my internship</p>
</li>
<li><p>Working hard to build an Excel parser that could handle even rare, unusual layout formats</p>
</li>
<li><p>Manually testing applications</p>
</li>
<li><p>Switching tasks every 5 minutes to check production data and errors</p>
</li>
<li><p>Switching tasks to review code</p>
</li>
<li><p>Debugging randomly</p>
</li>
</ul>
<p>If you're wondering why those activities are classified as low-leverage and what I should have done instead, let me break it down for you.</p>
<p><strong>👨‍💻 Attending business direction meetings during my internship</strong></p>
<p>During my internship, I joined a small startup as an AI Engineer. As an intern, your role is to support the company's product features, not to attend <strong>unnecessary meetings</strong> like business direction meetings. This took up a lot of my time, which I could have used to explore more AI methodologies and models to improve my engineering skills, instead of being sidetracked by business-specific issues.</p>
<p>At the time, I thought it was cool to be in these meetings, feeling "busy" and like I was contributing. But in reality, it did more harm than good. For those of you in a similar position or currently interning, to be an effective engineer, focus more on the engineering side. Code more, make mistakes, and learn as much as you can during your internship.</p>
<p><strong>👨‍🔧 Working hard to build an Excel parser that could handle even rare, unusual layout formats</strong></p>
<p>Even though this seems like a good thing to do, it is not as effective as it sounds. This doesn't just apply to Excel tasks; it applies to any feature you are working on. You should consider what "done" means for the intended features and treat anything outside that scope as something to improve later, or handle it manually if it happens in production, since it rarely occurs anyway. <strong>Stop aiming for perfection</strong>. By doing this, you can speed up the go-live time and quickly get feedback from real user interactions, instead of relying on your assumptions about the feature. Just ensure you have strong testing for those features. The expected input should produce the expected output, which leads us to the next point.</p>
<p><strong>👨‍🔬 Manually testing applications</strong></p>
<p>This is a common pitfall for new software engineers: they don't test their code. It's even worse when using a programming language without a type system 💀💀. Back in the day, I manually tested every feature I created. Every new feature that could potentially break existing ones needed to be re-tested by hand. This involved spinning up the UI instance and pointing the API to the local development server.</p>
<p>To be more effective, we should write test code. At the very least, do unit testing, where you test every pure function in your features—functions with no external dependencies or side effects. If you want to do more, as a backend engineer, you can perform API testing by automatically calling your new API with the intended data payload and verifying that it outputs the correct response. By doing this, you can be more efficient with your time every time you deploy a new feature, and it also boosts your confidence.</p>
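<p>A unit test for a pure function really is this small, which is why skipping it costs more than writing it. A sketch with a hypothetical <code>apply_discount</code> helper (the function and its rules are made up for illustration):</p>

```python
def apply_discount(price: float, percent: float) -> float:
    """Pure function: no I/O, no side effects, trivially unit-testable."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

# Unit tests: expected input -> expected output, no UI or local server needed.
assert apply_discount(100.0, 25) == 75.0
assert apply_discount(19.99, 0) == 19.99
```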
<p><strong>🔄 Frequently switching tasks</strong></p>
<p>As my team’s software gained more users, issues started appearing regularly: requests to check why certain actions trigger specific behaviors, to see whether certain data is already in the database, or to manually update data. These can sometimes interrupt my main workflow. You can respond to these ad-hoc requests immediately and solve the issues faster, but if you are also tasked with higher-priority work, like building new features, you should prioritize that. To handle this situation more effectively, set aside 1-2 hours in the morning or before ending your workday to batch these small, low-priority tasks and address them all at once. More serious issues can be treated as new tickets for fixes, while less serious ones can be resolved during this time.</p>
<p>This also applies to code review requests. In my current workplace, each of us needs to peer-review each other's Merge Requests. While sometimes there are high-priority MRs for production hotfixes, regular MRs can be grouped together and reviewed at the same time. This saves a lot of time by reducing the need to switch contexts frequently.</p>
<p><strong>🦠 Debugging randomly</strong></p>
<p>I used to do this all the time until recently. This behavior is hard to notice because I felt productive by constantly making quick fixes without really considering the actual cause. This led to hours wasted, only to realize I made a silly mistake in the first place. Instead, to be an effective engineer, you should pause for a moment, think about all the variables that might affect the unintended behavior, and then execute the solution. Usually, the solution is simple and easy to code, but when you debug randomly, you miss it. As Abraham Lincoln famously said, "<strong>Give me six hours to chop down a tree, and I will spend the first four sharpening the ax.</strong>"</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In "The Effective Engineer" by Edmond Lau, we learn about the idea of high-leverage work. I shared examples of how to become a more effective engineer by focusing on high-leverage work and reducing low-leverage tasks. If you have similar experiences and want to share your thoughts, feel free to post them in the comments section below. See you!</p>
]]></content:encoded></item><item><title><![CDATA[AI, Embeddings, and VectorDB: A Simple Guide]]></title><description><![CDATA[Before you think this is another article about recent AI or AI-powered technology with a chat interface, copying ChatGPT for a niche task that people will use a couple of times and then forget, don't worry—this article is not.
This article aims to br...]]></description><link>https://mbts.dev/ai-embeddings-and-vectordb-a-simple-guide</link><guid isPermaLink="true">https://mbts.dev/ai-embeddings-and-vectordb-a-simple-guide</guid><category><![CDATA[Artificial Intelligence]]></category><category><![CDATA[vector database]]></category><dc:creator><![CDATA[Iqbal Maulana]]></dc:creator><pubDate>Sat, 06 Jul 2024 15:55:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/jIBMSMs4_kA/upload/fec6851c7341c415b594cd5f34d6ee13.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before you think this is another article about recent AI or AI-powered technology with a chat interface, copying ChatGPT for a niche task that people will use a couple of times and then forget, don't worry—this article is not.</p>
<p>This article aims to take your knowledge of AI from zero to one. If you're wondering what the heck AI actually is, what Embeddings and VectorDB are, and how all of these relate to the recent boom in LLMs, make sure to read the rest of this article.</p>
<h2 id="heading-what-is-ai">What is AI?</h2>
<p>Computers are rigid. Tasks that appear simple to humans may be hard to replicate with a computer, for example:</p>
<ul>
<li><p>Calculating the order total for items that don’t use integers as stock</p>
</li>
<li><p>Cancelling a customer’s order if they don’t pay within 24 hours</p>
</li>
<li><p>Responding to human text messages or human voice</p>
</li>
<li><p>Recognizing human face</p>
</li>
<li><p>Searching documents using natural language</p>
</li>
<li><p>and more ...</p>
</li>
</ul>
<p>Overly complex tasks like recognizing pictures and responding to humans are hard to code manually, so nowadays people use AI. AI tries to mimic how the human brain works. The most-used algorithm in recent AI development is the Artificial Neural Network, which is inspired by the networks of neurons found in the human brain.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720262017570/28a3abc1-8f7e-4dfe-947b-944cc3b00260.png" alt class="image--center mx-auto" /></p>
<p>Instead of coding each rule manually, we provide many examples of input-output pairs and feed them to an AI algorithm. This process is called <strong>training</strong>. The final result of training is called a <strong>model or AI model</strong>, which represents the relationship between input and output data. The model is then used to generate output for future inputs; this step is called <strong>inference</strong>.</p>
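<p>To make training and inference concrete, here's a toy sketch (not a real neural network, just a single learnable weight): we "train" on input-output pairs whose hidden rule is output = 2 * input, then use the resulting "model" for inference on an unseen input.</p>
<pre><code class="lang-python"># Toy "training": learn w such that output = w * input,
# from example input-output pairs (the true rule is w = 2).
pairs = [(1, 2), (2, 4), (3, 6), (4, 8)]

w = 0.0  # model parameter, starts untrained
for _ in range(200):  # training loop
    for x, y in pairs:
        pred = w * x
        grad = 2 * (pred - y) * x  # gradient of the squared error
        w = w - 0.01 * grad        # gradient descent step

# "Inference": use the trained model on an unseen input.
print(round(w, 2))       # 2.0
print(round(w * 10, 1))  # predicts 20.0 for input 10
</code></pre>
<p>Real models have millions of parameters instead of one, but the loop is the same idea: adjust parameters so predictions match the example outputs.</p>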
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720262085137/5d567034-8353-4f60-bcfc-5deba30b2c8b.png" alt class="image--center mx-auto" /></p>
<p>To understand how AI processes data, we need to delve into vectors and embeddings.</p>
<h2 id="heading-what-is-vector-embeddings">What is a Vector? An Embedding?</h2>
<p>The term "vector" can be quite ambiguous, as it has various meanings depending on the context. In physics, vectors describe quantities with both magnitude and direction within a 3D space. In programming, vectors are often synonymous with arrays, while in mathematics, vectors have their own unique definition. There are even vectors in biology, and the list goes on.</p>
<p><strong>For our purposes in machine learning, we need to focus on the mathematical and programming vectors.</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720262211789/d09a2156-bd32-4677-abed-3dd5be84f3f5.png" alt class="image--center mx-auto" /></p>
<p>Hang tight, we'll need a little bit of math here to better understand AI.</p>
<h3 id="heading-mathematical-vector">Mathematical Vector</h3>
<p>Mathematical vectors were inherited from physics, so they are values with direction and magnitude.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720262433220/9081a05b-b429-455a-9cfb-2bd11261170e.png" alt class="image--center mx-auto" /></p>
<p><code>i</code> and <code>j</code> are 1D (one dimensional) vectors with the same magnitude, but with different directions. We also have the 2D and 3D vectors:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720262483526/72949caf-f5de-44b2-9ddc-e3035a84c7e1.png" alt class="image--center mx-auto" /></p>
<p>The difference between math and physics vectors is that while a physics vector is used to represent and analyze real-world physical quantities, a math vector is arbitrary and doesn't necessarily represent physical properties or rules.</p>
<h3 id="heading-simple-vector-that-represent-tabular-data">Simple Vectors that Represent Tabular Data</h3>
<p>Let's say we have this mock dataset of past house prices:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>ID</td><td>Neighborhood</td><td>Size (square meter)</td><td>Bedrooms</td><td>Bathrooms</td><td>Price (IDR)</td></tr>
</thead>
<tbody>
<tr>
<td>1</td><td>Uptown</td><td>600</td><td>3</td><td>2</td><td>500.000.000</td></tr>
<tr>
<td>2</td><td>Suburb</td><td>300</td><td>3</td><td>2</td><td>300.000.000</td></tr>
<tr>
<td>3</td><td>Suburb</td><td>450</td><td>4</td><td>3</td><td>400.000.000</td></tr>
<tr>
<td>4</td><td>Downtown</td><td>300</td><td>2</td><td>1</td><td>420.000.000</td></tr>
<tr>
<td>5</td><td>Uptown</td><td>400</td><td>2</td><td>2</td><td>450.000.000</td></tr>
</tbody>
</table>
</div><p>Just like a chef prepares ingredients before cooking, we can prepare data before feeding it into an AI model by doing <a target="_blank" href="https://builtin.com/articles/feature-engineering">feature engineering</a>. For example, the following dataset is the result after removing unused columns, encoding the neighborhood class, and normalizing all numeric columns.</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>neighborhood_uptown</td><td>neighborhood_suburb</td><td>neighborhood_downtown</td><td>size</td><td>bedrooms</td><td>bathrooms</td><td>price</td></tr>
</thead>
<tbody>
<tr>
<td>1</td><td>0</td><td>0</td><td>0.667</td><td>0.5</td><td>0.5</td><td>1.00</td></tr>
<tr>
<td>0</td><td>1</td><td>0</td><td>0.333</td><td>0.5</td><td>0.5</td><td>0.00</td></tr>
<tr>
<td>0</td><td>1</td><td>0</td><td>1.000</td><td>1.0</td><td>1.0</td><td>0.50</td></tr>
<tr>
<td>0</td><td>0</td><td>1</td><td>1.000</td><td>0.0</td><td>0.0</td><td>0.25</td></tr>
<tr>
<td>1</td><td>0</td><td>0</td><td>0.533</td><td>0.5</td><td>0.5</td><td>0.75</td></tr>
</tbody>
</table>
</div><p>Based on the processed dataset above, the vector representation for the house with ID 1 is <code>[1, 0, 0, 0.667, 0.5, 0.5, 1.00]</code>. This vector can then be fed into an AI model for training. The price of a future new house can be predicted by the final model, following the same feature engineering rules as the training data.</p>
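<p>As a rough sketch of the feature engineering described above, here is how one-hot encoding plus min-max normalization could look in Python. The exact scheme used to produce the table above may differ, so treat the numbers as illustrative:</p>
<pre><code class="lang-python"># One-hot encode the categorical column, min-max normalize numeric ones.
NEIGHBORHOODS = ["uptown", "suburb", "downtown"]

def one_hot(value):
    return [1 if value == n else 0 for n in NEIGHBORHOODS]

def min_max(value, lo, hi):
    return (value - lo) / (hi - lo)

def to_vector(neighborhood, size, bedrooms, bathrooms):
    # Ranges come from the whole dataset (here: the mock table's min/max).
    return one_hot(neighborhood) + [
        min_max(size, 300, 600),
        min_max(bedrooms, 2, 4),
        min_max(bathrooms, 1, 3),
    ]

print(to_vector("uptown", 600, 3, 2))  # [1, 0, 0, 1.0, 0.5, 0.5]
</code></pre>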
<h3 id="heading-advance-high-dimensional-vector-for-unstructured-data">Advanced High-Dimensional Vectors for Unstructured Data</h3>
<p>The previous section shows a simple vector representing traditional structured tabular data. However, with recent advancements in technology and the internet, we are generating more and more unstructured data, such as text, images, videos, and audio. By "unstructured," I mean data that cannot be easily placed into a spreadsheet. Sure, you can insert an image into Google Sheets, Excel, or a DBMS, but can you perform operations on it? The image below is taken from <a target="_blank" href="https://www.m-files.com/what-is-structured-data-vs-unstructured-data-3/">here</a>.</p>
<p><img src="https://www.m-files.com/wp-content/uploads/2022/10/Screen-Shot-2023-06-20-at-3.27.13-PM.png" alt /></p>
<p>To efficiently use this unstructured data for any purpose, we can no longer rely on traditional feature engineering to represent it. Instead, we use neural networks to distinguish between different pieces of unstructured data. The internal layers of a neural network, commonly referred to as "hidden layers," can generate vector representations of the data, typically in much higher dimensions. For example, OpenAI's generated vector embeddings have 1536 dimensions to represent a sentence.</p>
<p>It's impossible to illustrate vectors with more than three dimensions because we live in a 3-dimensional world. In math, a vector dimension is like an aspect, characteristic, or feature of the data. For example, ChatGPT is an NLP model, so its vector embeddings need many dimensions to capture the meaning of numerous words, context, interpretation, sentiment, and more. These are called <strong>high-dimensional vectors</strong>.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">These high-dimensional vectors are also commonly referred to as <strong>embeddings or vector embeddings.</strong></div>
</div>

<p>The process by which an AI model internally processes unstructured data within its hidden layer to generate representations that accurately reflect all the training data is difficult for humans to understand. Unlike traditional tabular data, where scientists or engineers determine the shape of the data used for training, it is still unclear how AI models arrive at their results with more advanced unstructured data. This is why some references may describe AI models as "black boxes." The effort to better understand how AI models work is called <a target="_blank" href="https://www.ibm.com/topics/explainable-ai">Explainable AI</a>.</p>
<h3 id="heading-how-to-retrieve-high-dimensional-vector-that-represent-unstructured-data">How to Retrieve High-Dimensional Vectors that Represent Unstructured Data?</h3>
<p>There are many techniques to represent complex unstructured data with vectors. For example, image data can be converted into vectors using pre-trained image classification models like ResNet or VGG-Net. For text data, we can use pre-trained word embedding models like Word2Vec or GloVe.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">📌</div>
<div data-node-type="callout-text">Since the text-specific area is not my focus and I have little experience with its real-world implementation, I'll focus on giving a more detailed example using image-specific models and techniques.</div>
</div>

<p>Image classification models like ResNet are trained on large datasets of human-labeled real-world images like <a target="_blank" href="https://www.image-net.org/">ImageNet</a>. After “seeing” so many images, the AI model can produce a contextualized vector representing a complex image.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720264271254/9053e7a9-73f5-47cd-bed0-10e833ea57c8.png" alt class="image--center mx-auto" /></p>
<p>The picture above shows a typical architecture of an AI model used to classify images. A special type of neural network called a <strong>Convolutional Neural Network</strong> is used to extract meaningful features from an image. The final layer, known as the classification layer, consists of regular neurons that represent each target output.</p>
<p>To retrieve the vector embedding of an image, we can remove the final classification layer from a complete model like ResNet. The model still accepts an image as input, but instead of outputting a target class, it outputs a 512-dimensional vector embedding.</p>
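<p>Loading ResNet itself is beyond the scope of this post, but the idea of "remove the final layer, keep the hidden representation" can be sketched with a toy two-layer network. All weights here are made up purely for illustration:</p>
<pre><code class="lang-python"># Toy fully-connected layers with fixed, made-up weights.
HIDDEN_W = [[0.5, -0.2], [0.1, 0.9], [-0.3, 0.4]]  # 3 hidden neurons, 2 inputs
OUTPUT_W = [[1.0, -1.0, 0.5]]                       # 1 class neuron, 3 hidden

def layer(weights, inputs):
    # ReLU-activated dense layer.
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]

def classify(image_features):
    hidden = layer(HIDDEN_W, image_features)  # the "hidden layer"
    return layer(OUTPUT_W, hidden)            # the "classification layer"

def embed(image_features):
    # Same model with the final layer removed: the hidden
    # activations ARE the embedding.
    return layer(HIDDEN_W, image_features)

print(embed([1.0, 2.0]))  # a 3-dimensional "embedding"
</code></pre>
<p>In a real ResNet the same trick yields the 512-dimensional embedding mentioned above instead of this 3-dimensional toy one.</p>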
<h2 id="heading-what-is-vectordb-why-do-we-need-it">What is VectorDB? Why Do We Need It?</h2>
<p>Vector DBs have been a trending topic in tech since AI became popular again. As you might guess, a vector DB is a special type of database used to store the vector embeddings generated by AI models.</p>
<p>A vector DB is most useful when we need a way to cache data representations generated by an AI model. For example, in a face recognition system, it's costly to send every registered face to the AI model just to compare it with a new face. Instead, we can save each registered face's embedding in a vector DB and then simply query the data at inference time.</p>
<p>Remember that a vector dimension is like an aspect, characteristic, or feature of the data. Similar data should share similar characteristics or features. That's why in a face recognition system, we query for the data most similar to the current input to determine who the new face belongs to, based on the registered faces.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720279110575/1f810193-9006-4922-9526-7d864106ab68.png" alt class="image--center mx-auto" /></p>
<p>The same principle applies to a technique called <a target="_blank" href="https://cloud.google.com/use-cases/retrieval-augmented-generation?hl=en">RAG or Retrieval-Augmented Generation</a>. This technique reduces hallucination in LLMs by contextualizing user queries based on known data. For each user prompt or question, we search documents or other types of knowledge bases with the highest similarity in characteristics or features to the user's prompt or question. Then, we send this information to the LLM to produce an answer based on the retrieved knowledge.</p>
<p>This technique has been proven to overcome some weaknesses of LLMs, such as dealing with outdated data and answering questions about topics outside their training data.</p>
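<p>The retrieval step of RAG can be sketched in a few lines. Here <code>fake_embed</code> is a stand-in for a real embedding model (it just counts a few keywords), and the retrieved document would then be pasted into the LLM prompt:</p>
<pre><code class="lang-python">import math

def fake_embed(text):
    # Stand-in for a real embedding model: counts keyword occurrences.
    words = text.lower().split()
    vocab = ["redis", "cache", "vector", "face"]
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

docs = [
    "redis is an in-memory cache",
    "vector databases store embeddings",
    "face recognition compares face embeddings",
]

def retrieve(question):
    # Pick the document most similar to the question.
    q = fake_embed(question)
    return max(docs, key=lambda d: cosine(q, fake_embed(d)))

context = retrieve("how does a redis cache work")
prompt = f"Answer using this context: {context}"  # sent to the LLM
print(prompt)
</code></pre>
<p>A real system would replace <code>fake_embed</code> with an embedding model and the in-memory list with a vector DB query, but the flow is the same.</p>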
<h3 id="heading-how-to-query-similar-data">How to Query Similar Data?</h3>
<p>To query vector embeddings, we can't use the operations we typically use in a conventional database system, like <code>=</code>, <code>!=</code>, <code>is</code>, <code>like</code>, etc. Instead, we use distance functions to measure the similarity between vectors.</p>
<p>Several similarity measures can be used, including:</p>
<ul>
<li><p><strong>Cosine similarity:</strong> measures the cosine of the angle between two vectors in a vector space. It ranges from -1 to 1, where 1 represents identical vectors, 0 represents orthogonal vectors, and -1 represents vectors that are diametrically opposed.</p>
</li>
<li><p><strong>Euclidean distance:</strong> measures the straight-line distance between two vectors in a vector space. It ranges from 0 to infinity, where 0 represents identical vectors, and larger values represent increasingly dissimilar vectors.</p>
</li>
</ul>
<p>and more ...</p>
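<p>The two measures above can be written in a few lines of Python:</p>
<pre><code class="lang-python">import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0]
print(cosine_similarity(a, [2.0, 0.0]))   # 1.0  (same direction: identical)
print(cosine_similarity(a, [0.0, 3.0]))   # 0.0  (orthogonal)
print(cosine_similarity(a, [-1.0, 0.0]))  # -1.0 (diametrically opposed)
print(euclidean_distance(a, [1.0, 0.0]))  # 0.0  (identical vectors)
</code></pre>
<p>Real vector DBs implement these (and approximate versions of them) natively, so you rarely compute them by hand, but this is all that is happening under the hood.</p>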
<p>As vector embeddings represent unstructured data, the positions of vectors in n-dimensional space can be read as how similar the underlying data is. For example, the words "Dog" and "Cat" may be located closer together in vector space than "Dog" and "Lion", because "Cat" and "Dog" are commonly used together in sentences and paragraphs describing pets across much of the training data, while "Lion" appears alongside "Dog" far less often, possibly not at all.</p>
<h3 id="heading-disclaimer-on-cosine-similarity">Disclaimer on Cosine Similarity</h3>
<p>Be aware that not all vector embeddings can be accurately judged as semantically similar using only cosine similarity, as discussed in this <a target="_blank" href="https://arxiv.org/abs/2403.05440">paper</a>. The paper itself states:</p>
<blockquote>
<p>While there are countless papers that report the successful use of cosine similarity in practical applications, it was, however, also found to not work as well as other approaches, like the (unnormalized) dot-product between the learned embeddings</p>
</blockquote>
<p>For the example in this article, which I will put in the next post, I can confirm that cosine similarity works really well for finding similar data. But if you want to build other projects leveraging different AI models, do your own research on which distance function is more suitable for the case.</p>
<h3 id="heading-choosing-vectordb">Choosing VectorDB</h3>
<p>One vector DB we can use is Postgres (as it is also a widely-used DB for all kinds of apps) with its extension <code>pgvector</code>. According to its documentation (each vector DB provider may differ), it supports the <a target="_blank" href="https://github.com/pgvector/pgvector#vector-functions">following vector operators and functions</a>:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1720279475483/cedd253e-ae9c-46fc-8810-b31a0f844acf.png" alt class="image--center mx-auto" /></p>
<p>Since it is a Postgres extension, I know many managed Postgres services already support it, such as Neon, Supabase, KoyebDB, and even Postgres-compatible services like AlloyDB.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>We have learned about AI, embeddings, and how these embeddings can be saved inside a VectorDB to efficiently query similar data in the future. This mechanism allows LLMs to reduce hallucinations and lets public LLMs respond based on private data. Additionally, other AI-powered apps, like face recognition systems, benefit from VectorDB to efficiently search for a new person's face among the registered faces.</p>
<p>These are a few reasons why people have been talking about VectorDB a lot recently. Some major companies are also investing a lot of money in anyone who can build a more efficient VectorDB.</p>
<p>To learn more about VectorDB by writing some actual code and using VectorDB to build your own AI-powered apps, stay tuned. Consider subscribing to get the next post with practical examples sent directly to your inbox!</p>
]]></content:encoded></item><item><title><![CDATA[Dependency Inversion for AI Engineer]]></title><description><![CDATA[I began my software engineering journey to build many AI-powered apps. Typically, the problems I need to solve are related to images and videos. In the AI world, this job is called Computer Vision, which is essentially about giving computers a pair o...]]></description><link>https://mbts.dev/dependency-inversion-for-ai-engineer</link><guid isPermaLink="true">https://mbts.dev/dependency-inversion-for-ai-engineer</guid><category><![CDATA[Software Engineering]]></category><category><![CDATA[Artificial Intelligence]]></category><dc:creator><![CDATA[Iqbal Maulana]]></dc:creator><pubDate>Fri, 24 May 2024 17:19:41 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1716571434899/c726f6c9-be19-4fc3-83b4-36009da02ad8.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I began my software engineering journey to build many AI-powered apps. Typically, the problems I need to solve are related to images and videos. In the AI world, this job is called Computer Vision, which is essentially about giving computers a pair of eyes.</p>
<p>After that, I transitioned my career into web development, specifically backend engineering, because the environment is very similar to developing AI-powered apps. As a backend engineer, I learned about the S.O.L.I.D principles. The last letter stands for Dependency Inversion, a concept that I wish I had known sooner.</p>
<div data-node-type="callout">
<div data-node-type="callout-emoji">💡</div>
<div data-node-type="callout-text">If you want to know what the other letters in the SOLID principles mean, I recommend reading this <a target="_blank" href="https://www.digitalocean.com/community/conceptual-articles/s-o-l-i-d-the-first-five-principles-of-object-oriented-design#dependency-inversion-principle">great article at DigitalOcean</a>.</div>
</div>

<div data-node-type="callout">
<div data-node-type="callout-emoji">❗</div>
<div data-node-type="callout-text">Disclaimer: Before you continue reading, let me clarify what I mean by AI Engineer. An AI Engineer is a Software Engineer who mainly builds AI-powered apps. The AI model itself can come from an open-source model or from their internal team. I usually distinguish between AI Engineer and AI Scientist. An AI Scientist focuses more on researching new model architectures and probably doesn't need the concepts I'm discussing in this post.</div>
</div>

<h2 id="heading-what-is-dependency-inversion">What is Dependency Inversion?</h2>
<blockquote>
<p>Entities must depend on abstractions, not on concretions. It states that the high-level module must not depend on the low-level module, but they should depend on abstractions.</p>
</blockquote>
<p>Not gonna lie, I got confused too when reading that line. But it turns out the meaning is quite straightforward. Let me break down the important terms:</p>
<ul>
<li><p><strong>Entities</strong>: This can be read as a service or function in your code.</p>
</li>
<li><p><strong>Abstraction</strong>: This can be read as an Interface if you are familiar with Java or TypeScript (these are programming languages I know that have Interfaces). If you are like me, from the Python world, this is called an <em>Abstract Base Class</em>. Read more about it <a target="_blank" href="https://docs.python.org/3/library/abc.html">here</a>.</p>
<p>  Basically, this abstraction is like a contract, defining what function or method names will be available in the real class. Only the names are defined; an abstract class doesn't have any logic inside its functions/methods.</p>
</li>
<li><p><strong>Concretions</strong>: This just means the actual class, the one written with <code>class</code> syntax and instantiated later in the main program flow, using the <code>new</code> keyword in TypeScript or simply calling <code>ClassName()</code> in Python.</p>
</li>
</ul>
<p>So, if you allow me to rewrite what Dependency Inversion means, it means:</p>
<blockquote>
<p>Your function should not depend on a hard-coded class with its different function/method names, but instead, depend on an abstract that defines the contract of your class. No matter what class you use later, the contract stays the same, and your main program won't need to change at all.</p>
</blockquote>
<p>Okay, maybe a bit of code example will help. I will use Python, as it's the language most AI Engineers use, and it's dead simple to read.</p>
<p>Imagine you need to build a program to optimize images for your company. Initially, because this is urgent due to storage costs increasing every month, they decide to pay one-third of the storage cost to an image optimizer service. You are tasked with integrating this third-party service into your app.</p>
<p>The next thing you do is open a merge request containing the following lines:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> third_party_sdk <span class="hljs-keyword">import</span> optimize 

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_image</span>(<span class="hljs-params">image</span>):</span>
    token = env.token
    response = optimize(image, token=token)

    _, optimized_image = response.parse()[<span class="hljs-number">0</span>]
    <span class="hljs-keyword">return</span> optimized_image
</code></pre>
<p>Okay, that works well. But next month, to save more money, your lead wants to build your own image optimizer since you don't need most of the features offered by the third-party provider anyway. So, you open another merge request.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> local_library <span class="hljs-keyword">import</span> get_shape, resize, optimize, convert_to_webp 

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_image</span>(<span class="hljs-params">image</span>):</span>
    width, height = get_shape(image)

    resize_ratio = <span class="hljs-number">0.7</span>
    image = resize(image, width=width*resize_ratio, height=height*resize_ratio)
    image = optimize(image)
    image = convert_to_webp(image)

    <span class="hljs-keyword">return</span> image
</code></pre>
<p>That also works well. But it turns out you forgot to handle edge cases where the image is corrupted and has a width and height of 0, causing the <code>resize</code> function to fail. Because this feature is so important, your lead decides to revert the code.</p>
<p>Oh no, your previous commit also contains another feature from a different task, so you can't simply revert it. You need to re-code the third-party implementation.</p>
<p>It was hectic, but it's done now.</p>
<p>But can you do better? Of course, this is where the <strong>Dependency Inversion</strong> principle shines.</p>
<p>Instead of making that function depend on the details of each image optimizer you use, you can define a contract that any optimizer must follow.</p>
<p>This way, the dependencies are inverted. Your main function no longer needs to depend on each image optimizer you plan to use. Instead, the image optimizer itself depends on the contract you created, ensuring compatibility.</p>
<p>Here is an example:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> abc <span class="hljs-keyword">import</span> ABC, abstractmethod

<span class="hljs-keyword">from</span> third_party_sdk <span class="hljs-keyword">import</span> optimize <span class="hljs-keyword">as</span> sdk_optimize
<span class="hljs-keyword">from</span> local_library <span class="hljs-keyword">import</span> get_shape, resize, optimize, convert_to_webp 

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ImageOptimizer</span>(<span class="hljs-params">ABC</span>):</span>
<span class="hljs-meta">    @abstractmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        <span class="hljs-keyword">pass</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SDKOptimizer</span>(<span class="hljs-params">ImageOptimizer</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, api_token: str</span>):</span>
        self._api_token = api_token

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        response = sdk_optimize(image, token=self._api_token)
        _, optimized_image = response.parse()[<span class="hljs-number">0</span>]

        <span class="hljs-keyword">return</span> optimized_image

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LocalOptimizer</span>(<span class="hljs-params">ImageOptimizer</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self</span>):</span>
        self._resize_ratio = <span class="hljs-number">0.7</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        width, height = get_shape(image)
        image = resize(
            image, 
            width=width*self._resize_ratio, 
            height=height*self._resize_ratio,
        )
        image = optimize(image)
        image = convert_to_webp(image)

        <span class="hljs-keyword">return</span> image

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_image</span>(<span class="hljs-params">optimizer: ImageOptimizer, image</span>):</span>
    image = optimizer.optimize(image)
    <span class="hljs-keyword">return</span> image
</code></pre>
<p>Take a moment to read that code fully. Now, no matter how many more optimizers you plan to use, or how many you reuse, your <code>upload_image</code> function doesn't need to change.</p>
<p>This hypothetical scenario happens more often than you think when building AI-powered apps. Today, you might use a HAAR Classifier to detect faces in your images, but next week, you might need to switch to a neural network and change the dependencies to MTCNN. You have to rewrite all the hard-coded HAAR-specific code to MTCNN-specific code.</p>
<p>Testing for improvements becomes difficult because moving back and forth between branches is counter-productive. Since your main program has changed, you can no longer benchmark the implementation in the same code base.</p>
<h2 id="heading-how-to-apply-dependency-inversion">How to Apply Dependency Inversion?</h2>
<p>One technique to implement Dependency Inversion is called <strong>Dependency Injection</strong>. Despite having the same acronym, DI, these two terms have different meanings. Dependency Inversion is the principle, while Dependency Injection is the way you apply that principle to your code base.</p>
<p>Dependency Injection means you inject your concrete class later at runtime. Let's use the same code from the previous section.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> abc <span class="hljs-keyword">import</span> ABC, abstractmethod

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">ImageOptimizer</span>(<span class="hljs-params">ABC</span>):</span>
<span class="hljs-meta">    @abstractmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        <span class="hljs-keyword">pass</span>

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">SDKOptimizer</span>(<span class="hljs-params">ImageOptimizer</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        <span class="hljs-comment"># reduced for simplicity</span>
        <span class="hljs-keyword">return</span> optimized_image

<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">LocalOptimizer</span>(<span class="hljs-params">ImageOptimizer</span>):</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">optimize</span>(<span class="hljs-params">self, image</span>):</span>
        <span class="hljs-comment"># reduced for simplicity</span>
        <span class="hljs-keyword">return</span> image

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">upload_image</span>(<span class="hljs-params">optimizer: ImageOptimizer, image</span>):</span>
    image = optimizer.optimize(image)
    <span class="hljs-keyword">return</span> image

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">main</span>():</span>
    optimizer = LocalOptimizer()
    image = load_image()  <span class="hljs-comment"># hypothetical helper that loads the image to upload</span>
    <span class="hljs-keyword">return</span> upload_image(optimizer, image)

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">'__main__'</span>:
    main()
</code></pre>
<p>That <code>optimizer</code> variable is the only line you need to change when you decide to switch back to a third-party optimizer via their SDK. If you want to automate this instantiation, you can use something like the <a target="_blank" href="https://refactoring.guru/design-patterns/factory-method">Factory Method pattern</a>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> Literal

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_optimizer</span>(<span class="hljs-params">mode: Literal[<span class="hljs-string">"local"</span>, <span class="hljs-string">"network"</span>]</span>) -&gt; ImageOptimizer:</span>
    <span class="hljs-keyword">if</span> mode == <span class="hljs-string">"local"</span>:
        <span class="hljs-keyword">return</span> LocalOptimizer()

    <span class="hljs-keyword">if</span> mode == <span class="hljs-string">"network"</span>:
        <span class="hljs-keyword">return</span> SDKOptimizer()

    <span class="hljs-keyword">raise</span> ValueError(mode)
</code></pre>
<h2 id="heading-why-di-matters-for-ai-engineer">Why DI Matters for AI Engineers?</h2>
<p>Well, for example, the current state-of-the-art model in any AI task changes frequently. Of course, you want to use newer and more efficient models, right? Or, your company has its own AI Scientist team, and they experiment with many models in production. Each model has the same objective, but their preprocessing and class encoding are slightly different. So, instead of changing your main code frequently to accommodate these unique requirements, why not just apply Dependency Inversion?</p>
<p>One open-source AI project I know that already implements this principle to make it easier for you to switch between dependencies is Deepface.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://github.com/serengil/deepface">https://github.com/serengil/deepface</a></div>
<p> </p>
<p>Deepface makes it easy to switch between face embedding models like Dlib, FaceNet, VGG-Face, and many more. You can adopt their approach to achieve this in your own codebase.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>The benefit of applying this principle to any AI-powered app is significant. For example, if you want to build car detection using YOLOv4 via Darknet and later realize that Ultralytics has YOLOv8, all you need to do is implement the <code>ObjectDetector</code> abstract class using Ultralytics YOLO.</p>
<p>Another scenario is when your app uses a centroid-tracking algorithm during development, but in production, you need a more advanced object tracking algorithm, possibly even a neural network. Don't worry—just implement the <code>ObjectTracker</code> base class without modifying your main code. The list goes on.</p>
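<p>As an illustration of that scenario, here is what an <code>ObjectTracker</code> contract might look like. All class and method names here are hypothetical, following the article's style:</p>
<pre><code class="lang-python">from abc import ABC, abstractmethod

class ObjectTracker(ABC):
    # The contract: any tracker takes detections and returns tracked IDs.
    @abstractmethod
    def track(self, detections):
        ...

class CentroidTracker(ObjectTracker):
    # Simple algorithm used during development; a neural-network tracker
    # would implement the same contract for production.
    def track(self, detections):
        return {i: det for i, det in enumerate(detections)}

def process_frame(tracker: ObjectTracker, detections):
    # Main code depends only on the abstraction, never on a concrete tracker.
    return tracker.track(detections)

print(process_frame(CentroidTracker(), ["car", "person"]))
# {0: 'car', 1: 'person'}
</code></pre>
<p>Swapping in the advanced tracker is then one line in the composition root, exactly as with the optimizer example earlier.</p>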
<p>Give it a try on your current project and let me know how it improves your workflow!</p>
]]></content:encoded></item></channel></rss>