Automated Platform Observability

Summary

There are often tasks that need to be done but that no one has time for. The goal of this project was to use natural language processing to automate important tasks that used to be impossible. In this case, a third-party vendor supplies important updates to different people at the bank through a bulletin board.

Solution

The solution I developed collects the data, analyses its impact, stores the important data in a database, and then sends the relevant information to the appropriate people.

Data collection is done with the Python Selenium library, which accesses the bulletins and scrapes the articles. Some of the articles, however, are spreadsheets pointing to other articles. To deal with this, the scraper recursively follows every link until it reaches the actual article.
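
The recursive link-following described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual scraper: the function name collect_articles, the fetch callable, the "index"/"article" labels, and the max_depth limit are all assumptions made for the example. In a real version, fetch would presumably be backed by a Selenium driver that loads the page and extracts its links; here it is a plain lookup so the logic can be read on its own.

```python
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical page description: ("article", []) for a real article,
# ("index", [urls...]) for a spreadsheet-style page that only links onward.
Page = Tuple[str, List[str]]


def collect_articles(
    url: str,
    fetch: Callable[[str], Page],
    seen: Optional[Set[str]] = None,
    max_depth: int = 10,
) -> List[str]:
    """Follow links recursively until actual articles are reached."""
    if seen is None:
        seen = set()
    # Skip pages already visited (guards against link cycles) and stop
    # once the assumed depth limit has been used up.
    if url in seen or max_depth < 0:
        return []
    seen.add(url)

    kind, links = fetch(url)
    if kind == "article":
        return [url]

    # An index page: descend into each link it points to.
    articles: List[str] = []
    for link in links:
        articles.extend(collect_articles(link, fetch, seen, max_depth - 1))
    return articles


# Stand-in for the bulletin board: one spreadsheet links back to the board.
pages = {
    "board": ("index", ["sheet-1", "article-a"]),
    "sheet-1": ("index", ["article-b", "board"]),
    "article-a": ("article", []),
    "article-b": ("article", []),
}
print(collect_articles("board", pages.__getitem__))
```

The visited set is the main design choice here: without it, two spreadsheets that link to each other would send the recursion into a loop.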

The impact of each article is determined by Gemini Flash 3.5 according to a strict set of rules, tuned over several iterations of prompt engineering. The model is forced to output JSON, which is then stored in MongoDB along with other relevant information. Incremental loading was added to improve performance and avoid unnecessary compute use.
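
As a rough illustration of the two ideas in this step, the sketch below checks that a model response is valid JSON with the expected fields, and filters out articles that have already been stored. The field names, the impact labels, and the helper names parse_assessment and select_new are hypothetical; the real rules and schema are not shown here. It uses only the standard library, so the set of stored URLs is passed in directly, whereas in a MongoDB-backed version it would come from a query against the collection.

```python
import json
from typing import Dict, Iterable, List, Set

# Hypothetical schema and labels, chosen only for this example.
REQUIRED_FIELDS = {"impact": str, "summary": str, "teams": list}
ALLOWED_IMPACT = {"none", "low", "medium", "high"}


def parse_assessment(raw: str) -> Dict:
    """Parse the model's raw text and reject anything off-schema."""
    data = json.loads(raw)  # raises a ValueError subclass on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"missing or invalid field: {field}")
    if data["impact"] not in ALLOWED_IMPACT:
        raise ValueError(f"unknown impact level: {data['impact']}")
    return data


def select_new(urls: Iterable[str], already_stored: Set[str]) -> List[str]:
    """Incremental loading: keep only URLs that are not stored yet."""
    return [url for url in urls if url not in already_stored]


raw = '{"impact": "high", "summary": "API change", "teams": ["payments"]}'
print(parse_assessment(raw)["impact"])
print(select_new(["a", "b", "c"], {"b"}))
```

Validating before storing means a malformed response fails loudly at this step rather than surfacing later as a broken email, and filtering first means the model is only called for articles that have not been analysed before.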

After the data has been analysed and stored, an automated email is sent to the appropriate contacts, containing the information they need to know as well as the source.
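
A notification of this kind could be assembled along the following lines. This is only a sketch using Python's standard email package: the function name build_alert, the sender address, the subject format, and the body layout are assumptions for illustration, and the actual sending step (for example through smtplib) is left out.

```python
from email.message import EmailMessage
from typing import List


def build_alert(
    recipients: List[str],
    title: str,
    summary: str,
    source_url: str,
    sender: str = "observability@example.com",  # placeholder address
) -> EmailMessage:
    """Build a plain-text alert containing the summary and its source."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = f"Vendor bulletin: {title}"
    # The body carries what the contact needs to know, then the source link.
    msg.set_content(f"{summary}\n\nSource: {source_url}\n")
    return msg


alert = build_alert(
    ["a@example.com", "b@example.com"],
    "Scheduled maintenance",
    "The vendor platform will be unavailable on Saturday night.",
    "https://example.com/bulletins/123",
)
print(alert["To"])
print(alert.get_content())
```

Keeping message construction separate from delivery makes the content easy to check without a mail server.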