On Document Visual Question Answering (DocVQA)

Contemporary Document Analysis and Recognition (DAR) research tends to focus on generic information extraction tasks (character recognition, table extraction, word spotting), largely disconnected from the final purpose the extracted information is used for. The DocVQA challenge seeks to inspire a “purpose-driven” point of view in DAR research, where the document content is extracted and used to respond to high-level tasks defined in natural language by the human consumers of this information. In this sense, DocVQA constitutes a new problem space, where high-level semantic tasks dynamically drive information extraction algorithms to conditionally interpret document images.

DocVQA is modelled as a Visual Question Answering (VQA) problem where the task is to answer a natural language question asked on a single document image or a collection of digitised documents. This goes beyond passing a document image through OCR, and involves understanding all types of information conveyed by a document. Textual content (handwritten or typewritten), non-textual elements (marks, tick boxes, separators, diagrams), layout (page structure, forms, tables), and style (font, colours, highlighting), to mention just a few, may all be necessary for responding to the question at hand.

The DocVQA challenge is a continuous, evolving effort, linked to various scientific events. The ICDAR 2021 competition is the second edition of the challenge; the first edition was organised in the context of the CVPR 2020 Workshop on Text and Documents in the Deep Learning Era.



The challenge comprises two tasks, one focused on answering questions asked on a single document and the other on answering questions posed over a collection of document images.

The challenge will be hosted at the Robust Reading Competition (RRC) portal. Further details on the tasks, datasets, and evaluation metrics will be posted on the portal according to the schedule below.

Important dates


  • 10th November 2020: Release of dataset(s)
  • 31st March 2021: Deadline for Competition submissions
  • 5th-10th September 2021: Results presentation at ICDAR

Terms and conditions


  • Each team must be registered in the RRC portal and the submissions from a team must be made from a single account.
  • You are free to use extra training data or employ pretraining / ensembling strategies to improve performance. However, training on the validation set is not allowed.
  • All teams are requested to add at least a minimal description of their approach and their correct affiliation at the time of submission.
  • Before you download the data, make sure you read the terms and conditions concerning the use of the respective datasets, and download a dataset only if you agree to its terms.

