Automated Platform Observability

Challenge

A third-party vendor publishes important updates for different teams at the bank on a bulletin board. However, staff do not have time to search the board for the updates that concern them. The bulletins are unstructured, which made this automation infeasible before large language models existed.

Approach

The solution I developed collects the bulletins, analyzes their impact, stores the important data in a database, and then sends the relevant information to the appropriate people. Some bulletins, however, are spreadsheets that point to other bulletins. To handle this, the scraper recursively follows every link until it reaches the actual article. Gemini Flash 3.5 then determines the impact of each article according to a strict set of rules tuned over several iterations of prompt engineering. After the data has been analyzed and stored, an automated email goes out to the appropriate contacts with the information they need, along with the source.
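The recursive link-following step could look roughly like the sketch below. It is an illustration under stated assumptions, not the production scraper: the page shape (`kind`, `links`, `text`), the `resolve_articles` name, and the `max_depth` guard are all hypothetical, and the `fetch` callable stands in for whatever Selenium-based page loader the real system uses.

```python
from typing import Callable, Optional

# Hypothetical page shape: {"kind": "index" | "article", "links": [...], "text": "..."}
Page = dict


def resolve_articles(
    url: str,
    fetch: Callable[[str], Page],
    seen: Optional[set] = None,
    max_depth: int = 5,
) -> list:
    """Follow index pages depth-first until actual articles are reached.

    Returns (url, text) pairs. `seen` prevents loops when index pages link
    back to each other; `max_depth` is an assumed safety limit.
    """
    if seen is None:
        seen = set()
    if url in seen or max_depth < 0:
        return []
    seen.add(url)

    page = fetch(url)
    if page["kind"] == "article":
        return [(url, page["text"])]

    # Index page (for example a spreadsheet of links): recurse into each link.
    articles = []
    for link in page.get("links", []):
        articles.extend(resolve_articles(link, fetch, seen, max_depth - 1))
    return articles


# In-memory stand-in for the bulletin board, including a link cycle.
FAKE_SITE = {
    "board": {"kind": "index", "links": ["sheet-1", "article-a"]},
    "sheet-1": {"kind": "index", "links": ["article-b", "board"]},
    "article-a": {"kind": "article", "text": "Maintenance window on Friday"},
    "article-b": {"kind": "article", "text": "API v1 retired next quarter"},
}

found = resolve_articles("board", FAKE_SITE.__getitem__)
for article_url, text in found:
    print(article_url, "->", text)
```

Passing `fetch` in as a parameter keeps the traversal logic testable without a browser; in a live run it would wrap the Selenium driver and whatever parsing decides whether a page is an index or an article.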

Contribution

  • End-to-end system design
  • Recursive data scraping with the Python Selenium library
  • Forced JSON output from Gemini Flash 3.5 to evaluate the impact of each bulletin
  • Data storage in MongoDB with incremental loading
  • Automated email notifications
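
As a rough illustration of the JSON-validation and incremental-loading items above, the sketch below checks a model reply against an expected shape and stores a bulletin only if it has not been seen before. Everything specific here is an assumption made for the example: the field names (`impact`, `teams`, `summary`), the impact scale, the content-hash key, and the plain dictionary that stands in for the MongoDB collection.

```python
import hashlib
import json

# Hypothetical impact scale; the real rule set is not shown in this write-up.
ALLOWED_IMPACTS = {"none", "low", "medium", "high"}


def parse_impact(raw: str) -> dict:
    """Validate the model's JSON reply and raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("model reply must be a JSON object")

    impact = data.get("impact")
    if not isinstance(impact, str) or impact not in ALLOWED_IMPACTS:
        raise ValueError(f"unexpected impact value: {impact!r}")

    teams = data.get("teams")
    if not isinstance(teams, list) or not all(isinstance(t, str) for t in teams):
        raise ValueError("'teams' must be a list of strings")

    return {"impact": impact, "teams": teams, "summary": str(data.get("summary", ""))}


def bulletin_id(url: str, text: str) -> str:
    """Stable key derived from the source URL and the article text."""
    return hashlib.sha256(f"{url}\n{text}".encode("utf-8")).hexdigest()


def load_incrementally(store: dict, url: str, text: str, raw_reply: str) -> bool:
    """Store only unseen bulletins; return True when a new record was added."""
    key = bulletin_id(url, text)
    if key in store:
        return False  # already processed, so no duplicate record or email
    store[key] = {"url": url, **parse_impact(raw_reply)}
    return True


store = {}  # stand-in for the database collection
reply = '{"impact": "high", "teams": ["payments"], "summary": "API v1 retired"}'
first = load_incrementally(store, "article-b", "API v1 retired next quarter", reply)
second = load_incrementally(store, "article-b", "API v1 retired next quarter", reply)
print(first, second, len(store))
```

Validating the reply before storing it means a malformed model response fails loudly instead of reaching the database or an email. With MongoDB, the same idea would typically use the hash as a unique key so that repeated runs skip bulletins that are already stored.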