A Virtual Hackathon Together with Microsoft

 
 

Innovation | Radu Orghidan | 08 July 2020

In May 2020, we held a two-day hackathon with Microsoft. The aim was to translate medical forms completed in several languages for our travel insurance client, and to test whether an automated translation solution could be a viable replacement for the current solutions at a fraction of the cost.

BUSINESS CONTEXT

We all consider the possibility of getting sick while traveling. Travel insurance companies know this and provide compelling packages that give us peace of mind and help us recover the expenses incurred by a health incident. However, the process can be complicated by several factors, one of the foremost being the difficulty of communication between travel insurance companies and remote doctors. Upon returning home from abroad, patients have to send the receipts, medical letters, and all other documents provided by the foreign medical institution to the travel insurance company.

These documents are potentially in different languages, have different formats, and use medical terminology. Needless to say, insurance companies require accurate translations of them. One of our customers, a travel insurance company, challenged us to build a system that can help them deal with these documents. We took up the challenge with a virtual hackathon, originally planned as a one-day event, with the help of our partners at Microsoft.

PREPARING FOR THE PROCESS

We started two weeks in advance, requesting samples of documents to be translated and asking all the necessary questions in order to gain a common understanding of our customer’s objectives and current status. The resulting process workflow is depicted further below in Figure 1. It begins with an email being received by our automated document translation service. Attachments are extracted, and an image is produced for each page of the attached documents (usually PDFs). Each image is processed, and then optical character recognition (OCR) is performed in order to extract the text.

Document layout analysis is an optional step that helps with documents whose structure is known in advance. The extracted text is translated and sent back by email. Reconstruction can be performed if the translated document needs to look similar to the original one.
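The workflow above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the hackathon solution used Logic Apps and C# Functions, whereas the sketch below uses Python, and the `Attachment` class, the `process_email` function, and the two stand-in stages are hypothetical names that do not come from the original solution.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Attachment:
    """One email attachment moving through the pipeline (hypothetical type)."""
    name: str
    content: bytes
    extracted_text: str = ""
    translated_text: str = ""


def process_email(
    attachments: List[Attachment],
    extract_text: Callable[[bytes], str],
    translate: Callable[[str], str],
) -> List[Attachment]:
    """Run every attachment through text extraction, then translation."""
    for attachment in attachments:
        attachment.extracted_text = extract_text(attachment.content)
        attachment.translated_text = translate(attachment.extracted_text)
    return attachments


# Stand-in stages so the sketch runs without any cloud service.
def fake_extract(content: bytes) -> str:
    return content.decode("utf-8")


def fake_translate(text: str) -> str:
    return f"[en] {text}"


result = process_email(
    [Attachment(name="informe.pdf", content=b"Informe medico")],
    fake_extract,
    fake_translate,
)
print(result[0].translated_text)
```

Passing the stages in as functions keeps the flow independent of any particular OCR or translation service, which is convenient when one stage is swapped for another.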

ARCHITECTURE AND TECHNOLOGY STACK

As shown in Figure 1, our approach involves three main Microsoft Azure technologies:

- Logic Apps, which we used for the process flow automation,
- Functions, which are small pieces of serverless C# code that are triggered by Logic Apps along the pipeline and are used for pre- and post-processing and for calling the Cognitive Services,
- Cognitive Services, which offer the respective machine learning (ML) functionalities for reading the document (using OCR) and for the translation.

All data gets stored in Azure Storage.

Figure 1. The solution architecture.

Here is a general overview of these technologies, along with how we used them in this specific case:

Azure Logic Apps enable the connection of apps and services, automating workflows without writing code. By using Logic Apps, our dev teams can create business processes and workflows, integrate with SaaS and enterprise applications, and take advantage of the Microsoft Cloud to enhance the integration solutions. 

Azure Functions is a serverless compute service that lets you run small pieces of code triggered by events without having to explicitly provision or manage infrastructure. Functions can be used for integrating systems, working with the Internet of Things (IoT), and building simple APIs and microservices for processing large data volumes. There are several pricing plans, depending on the usage needs. We used Azure Functions to call the suitable Cognitive Services for OCR, for context understanding using Read API, and for translation. 

Cognitive Services bring ML models into the hands of our developers without the need to build and train models from scratch. By simply calling an API, the app is enhanced with human-like abilities (seeing, hearing, speaking, searching) and accelerated decision-making. Cognitive Services offer all of the above while supporting a wide choice of programming languages.

To clarify, we initially planned to use OCR on input images but, on the advice of the Microsoft experts, we ended up using Read API. It provided more accurate results and was also able to access PDF documents directly. In this case, the expertise brought in by our colleagues from Microsoft was key to our choice of a more suitable technology.
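To give an idea of what the extracted text looks like, here is a minimal sketch that flattens a Read result into plain lines. It assumes the JSON shape returned by version 3.0 of the Read API (`analyzeResult.readResults[].lines[].text`); other versions may differ, and the function name and sample data are hypothetical.

```python
from typing import Any, Dict, List


def lines_from_read_result(result: Dict[str, Any]) -> List[str]:
    """Collect recognised text lines, page by page, from a Read result.

    Assumes the v3.0 response shape; adjust the keys for other versions.
    """
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]


# Hand-written sample, not real service output.
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {"page": 1, "lines": [{"text": "Informe medico"}, {"text": "Paciente"}]},
            {"page": 2, "lines": [{"text": "Diagnostico"}]},
        ]
    },
}

text = "\n".join(lines_from_read_result(sample))
print(text)
```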

TRANSLATION PROCEDURE

In our particular case, the arrival of a new email in a dedicated mailbox is the trigger for starting the analysis of the attachments (see Figure 2). After that, we iterate through all the attachments and store their IDs.

Figure 2. The arrival of a new email triggers the analysis of the attachments.

Next, the metadata file for each email is created – including some details about the email (see Figure 3).
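As an illustration, the metadata file could be a small JSON document like the one produced below. The article does not list the fields that were actually stored, so every field name here is an assumption made for the sketch.

```python
import json
from typing import List


def build_email_metadata(
    email_id: str, sender: str, subject: str, attachment_ids: List[str]
) -> str:
    """Serialise basic email details to JSON (hypothetical field names)."""
    metadata = {
        "emailId": email_id,
        "sender": sender,
        "subject": subject,
        "attachmentIds": attachment_ids,
    }
    return json.dumps(metadata, indent=2)


print(build_email_metadata(
    "email-001", "patient@example.com", "Medical report", ["att-1", "att-2"]
))
```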

Figure 3. Creation of the metadata file for each email.

The file hierarchy necessary for storing each message and its attachments is created in the Azure blob storage, as shown in Figure 4.

A new folder is created for each email, provided it has attachments. The folder name is the email ID. This folder contains subfolders for each attachment with the name being the attachment ID. Each subfolder contains a structure as shown in Figure 5, where:

- The ‘binary’ folder contains the original file received as attachment;
- The ‘read’ folder contains the JSON object with the text extracted by Read API/OCR;
- The ‘reconstruction’ folder contains a txt file with the translated text;
- The ‘metadata’ file contains details about the initial file.
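The hierarchy described above can be expressed as a function that derives the blob paths for one attachment. The folder names come from the list above, while the file names inside each folder and the function name are assumptions made for this sketch.

```python
from typing import Dict


def attachment_blob_paths(
    email_id: str, attachment_id: str, original_name: str
) -> Dict[str, str]:
    """Return the blob paths used for one attachment.

    Folder layout follows the article; the file names are hypothetical.
    """
    base = f"{email_id}/{attachment_id}"
    stem = original_name.rsplit(".", 1)[0]  # file name without its extension
    return {
        "binary": f"{base}/binary/{original_name}",
        "read": f"{base}/read/{stem}.json",
        "reconstruction": f"{base}/reconstruction/{stem}.txt",
        "metadata": f"{base}/metadata.json",
    }


paths = attachment_blob_paths("email-001", "att-1", "informe.pdf")
for purpose, path in paths.items():
    print(f"{purpose}: {path}")
```

Keeping the email ID and the attachment ID in the path means that any later step only needs those two identifiers to locate every artefact of an attachment.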

Figure 4. Azure blob storage for the attachments.

Figure 5. The file hierarchy for storing the messages.

Finally, the blob for saving the email metadata is created (see Figure 6), and the second Logic App is called. The email ID and its attachment IDs are sent to the second application.

Figure 6. Creation of the blob for saving the email metadata.

In the second Logic App, two functions are called for each attachment: one using Read API for extracting the text from the images, and the other one to translate the extracted text to English, as shown in Figure 7.
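The translation call can be sketched as follows. The sketch assumes version 3.0 of the Translator Text REST API on its global endpoint and only builds the request and parses a response, so it runs without a subscription; the key, the region, and the helper names are placeholders, and the original Functions were written in C#.

```python
import json
from typing import Any, Dict, List, Tuple
from urllib.parse import urlencode

# Global Translator endpoint (assumed; a regional endpoint may be used instead).
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(
    texts: List[str], key: str, region: str, to: str = "en"
) -> Tuple[str, Dict[str, str], bytes]:
    """Return the URL, headers, and JSON body for a translate call."""
    query = urlencode({"api-version": "3.0", "to": to})
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json",
    }
    body = json.dumps([{"Text": text} for text in texts]).encode("utf-8")
    return f"{TRANSLATOR_ENDPOINT}?{query}", headers, body


def parse_translate_response(payload: List[Dict[str, Any]]) -> List[str]:
    """Take the first translation returned for each input text."""
    return [item["translations"][0]["text"] for item in payload]


url, headers, body = build_translate_request(
    ["Informe medico"], key="<your-key>", region="<your-region>"
)
print(url)

# Hand-written sample response, not real service output.
sample_response = [{"translations": [{"text": "Medical report", "to": "en"}]}]
print(parse_translate_response(sample_response))
```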

Figure 7. The two functions called for each attachment are Read API and Translate.

Finally, the third Logic App is used for sending the email (see Figure 8).

Figure 8. Function for sending the email.

The third Logic App, shown in Figure 9, receives the email ID from the previous app and retrieves the attachment IDs from the email metadata.


Figure 9. Get attachment IDs from the email metadata.

For each attachment, we get the metadata and the reconstructed txt file from the storage and reconstruction folders, respectively, as shown in Figure 10.

Figure 10. Obtain the metadata and the reconstructed txt file from the storage and reconstruction folders.

The file content is stored in an object and the txt file gets the name of the original document received. In the final step, the reply for the initial email is created. It will contain the txt files with the translated text as attachments and a message in the email body.
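The structure of such a reply can be illustrated with Python's standard `email` package. In the hackathon the reply was created and sent by a Logic App, so this is only a sketch of the message layout; the function name, the addresses, and the body text are hypothetical.

```python
from email.message import EmailMessage
from typing import Dict


def build_reply(
    original_sender: str,
    original_subject: str,
    service_address: str,
    translations: Dict[str, str],
) -> EmailMessage:
    """Create a reply with one txt attachment per translated document.

    `translations` maps the original file name to its translated text.
    """
    reply = EmailMessage()
    reply["To"] = original_sender
    reply["From"] = service_address
    reply["Subject"] = f"Re: {original_subject}"
    reply.set_content("Please find the translated documents attached.")
    for original_name, text in translations.items():
        stem = original_name.rsplit(".", 1)[0]
        # The txt file takes the name of the original document.
        reply.add_attachment(text, subtype="plain", filename=f"{stem}.txt")
    return reply


reply = build_reply(
    "patient@example.com",
    "Medical report",
    "translations@example.com",
    {"informe.pdf": "Medical report\nPatient"},
)
print([part.get_filename() for part in reply.iter_attachments()])
```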

Figure 11. The reply for the initial email is created.

THE HACKATHON 

We had originally wanted to run this hackathon in a more traditional, face-to-face format at our Cluj office. However, with current circumstances requiring us to work remotely, we had to be flexible. On the day of the hackathon, four developers and a business analyst from Endava joined two solution architects from Microsoft on a Teams video conference call (see Figure 12). The customer was also invited to join the event kick-off and the wrap-up meeting.

Figure 12. The morning planning session.

The hackathon team consisted of:

Endava
  • Pavel Spataru
  • Dorin Bazgan
  • Daniel Moniry-Abyaneh
  • Jay Chitnis
  • Radu Orghidan
  • Bradley Howard
  • Razvan Berinde

Microsoft

The day started with an online setup session on Teams with everybody involved. The objectives were presented, and the details were quickly aligned with the customer. Then the team split up, with each developer tackling one or two tasks. The development process evolved throughout the day, with the partners from Microsoft holding one-on-one sessions with our colleagues.

The demo with the customer was scheduled for 5:30 pm. An hour before, we had an all-hands call for a technical status update.
  
As described below, the planned pipeline was presented during the demo.

First, the email containing the PDF document to be translated is sent (see Figure 13). 

Figure 13. The PDF document to be translated is sent by email.

The document is received and processed in less than 30 seconds. The translation is returned to the sender as a text attachment. It can be compared with the original Spanish document for an assessment of the quality of the translation.

When solving a problem related to automated text recognition, it is important to understand the options offered by the Cognitive Services for printed and handwritten text in order to use the most suitable function for each case. In our situation, we needed to asynchronously process text-heavy content in both images and PDF while considering the context. The Read API ticked all the boxes and became our preferred option (see Figure 14).
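Because the Read API works asynchronously, a caller first submits the document and then polls for the result. The sketch below shows only the polling loop and assumes the status values used by version 3.0 of the API (`running`, `succeeded`, `failed`); the `get_status` callable stands in for the HTTP GET on the operation URL, and all names are hypothetical.

```python
import time
from typing import Any, Callable, Dict


def wait_for_read_result(
    get_status: Callable[[], Dict[str, Any]],
    poll_interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the operation reports 'succeeded' or 'failed'."""
    for _ in range(max_attempts):
        result = get_status()
        status = result.get("status")
        if status == "succeeded":
            return result
        if status == "failed":
            raise RuntimeError("Read operation failed")
        sleep(poll_interval)
    raise TimeoutError("Read operation did not finish in time")


# Simulated service: two 'running' answers, then success.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "succeeded", "analyzeResult": {"readResults": []}},
])
final = wait_for_read_result(lambda: next(responses), sleep=lambda _: None)
print(final["status"])
```

Bounding the number of attempts keeps a stuck operation from blocking the rest of the pipeline indefinitely.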

Figure 14. Context understanding using Read API.

The quality of the translation provided by the Translator Text API can be enhanced with Custom Translator, which enabled us to build customised dictionaries that accurately handle the translation of medical terms (see Figure 15).
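In version 3.0 of the Translator Text API, a custom model is selected by adding a `category` parameter to the translate request. The sketch below builds such a query string; the category ID shown is a made-up placeholder, and the function name is hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlencode


def translate_query(to: str = "en", category: Optional[str] = None) -> str:
    """Build the query string for a translate call.

    `category` is the ID of a Custom Translator model; omit it to use
    the general-purpose model.
    """
    params: Dict[str, str] = {"api-version": "3.0", "to": to}
    if category:
        params["category"] = category
    return urlencode(params)


print(translate_query())
print(translate_query(category="my-medical-model"))  # placeholder ID
```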

Figure 15. Translation functionality.

A few examples of automated translations are shown in Figure 16. Details are also presented in Figure 17 and Figure 18.

Figure 16. Side by side of translation and original document.


Figure 17. Detail of the translation.


Figure 18. Original document.

LESSONS LEARNED AND FUTURE WORK

We originally planned a one-day event, but we extended it to a second day because one day was not enough to properly tackle all the technical issues that appeared along the way and to test the final PoC. We now recommend running two-day events as a minimum, especially if they’re happening online.

The learning curve was flattened with the help of the experts from Microsoft. Their deep understanding of specific technologies perfectly complements our broader, but sometimes shallower, expertise. The Microsoft architects didn’t write any code themselves, but instead taught us the principles of each service more quickly than we could have learned them by ourselves. If you can, ask the vendor for technical mentors to help your teams adopt new technologies faster.

During the hackathon, we were able to focus on the problem and technology without interruptions. We recommend regular but infrequent check-in calls. Approximately three a day should be enough.

A hackathon can result in an improved relationship with the customer. In our case, the customer was involved in the hackathon, in the requirements analysis, the creative process, and the demo session, and was able to appreciate our talent and hard work. The PoC is still used by the customer as a reference to compare the accuracy and speed of the results with the current solutions. This leads to new opportunities and insightful discussions about the technical possibilities.

The solutions that we initially envisaged were improved through discussions with the Microsoft team, who brought alternatives to our attention. For example, the OCR functionality that we originally planned to use was replaced by the Read API, a better option that not only considers the context of the phrases, but also provides more accurate results.

During the interaction with our team, the colleagues from Microsoft seized the opportunity to discuss other practical applications in domains of strategic value for us, such as insurance, fintech, and multimedia. This has inspired our team to make joint hackathons part of our approach to developing healthy relationships with our customers and technical partners. Moreover, virtual hackathons, as opposed to in-person events, can increase the speed of knowledge transfer while respecting social distancing.

CONCLUSION

Running a hackathon, even a virtual one, can be a fantastic way to accelerate the sales and delivery of a project by using new technology. We look forward to running more events like this in the future.

Radu Orghidan

VP Cognitive Computing

Radu is a technical business consultant with in-depth knowledge of innovation management. He leads cross-functional teams to achieve cutting-edge technical objectives for our clients. His projects use data acquired from different sensors such as depth-sensing cameras, mobile robots or microphones to create systems that enhance a user’s ability to understand and interact with the environment… or sometimes to create 3D scans of his two kids. In his free time, Radu loves running outdoors, skiing and being in the snow, or eating seafood while sharing a good bottle of wine.

 
