Introduction

TAT-HQA (Tabular And Textual dataset for Hypothetical Question Answering) is a numerical QA dataset of hypothetical questions, i.e. questions containing an assumption beyond the original context. TAT-HQA aims to equip QA models with counterfactual thinking: the ability to imagine and reason over unseen cases based on the seen facts and the counterfactual assumption.

In addition to the features of TAT-QA, every TAT-HQA question contains a hypothetical assumption that goes beyond the given context.

In total, TAT-HQA contains 8,283 questions associated with 2,758 hybrid contexts from real-world financial reports.

The following is an example of TAT-HQA along with its original TAT-QA example. The left box shows a factual tabular and textual context, where a normal question (the upper middle box) is asked. Based on the normal question, a hypothetical question is asked with an additional assumption "if the amount in 2019 was $132,935 instead", leading to a different answer (-30,000 vs. -29,253). The blue dashed boxes indicate the position that the assumption affects. The QA model is supposed to imagine a counterfactual context (the right dashed box) and reason over this imagined situation.

TAT-HQA Sample
For more information, please read our ACL 2022 paper [PDF].

Getting Started

{
  "table": {                                                          
    "uid": "a3e9cad512b8d3ff0cd6e50774007eeb",                        
    "table": [                                                         
      [
        "(In thousands of $)",
        "2019",
        "2018"
      ],
      [
        "Net Debt Receipts",
        "$  243,513",
        "$   30,300"
      ],
      ...
    ]
  },
  "paragraphs": [                                                        
    {
      "uid": "76573d23233ebdfc3c89609c6372e951",                    
      "order": 1,                                                        
      "text": "The most significant impacts of Hilli LLC VIE's 
               operations on our consolidated statements of income and 
               consolidated statements of cash flows, as of December 31,
               2019 and 2018, are as follows: ..."
    },
    ...
  ],
  "questions": [                                         # Examples of 1) an original question and 2) a hypothetical question.                                         
    {
      "uid": "b68356a11ee8d39571d44b087b1558c7",                                
      "order": 1,                                                        
      "question": "What was the change in net debt receipts between 2018 and 2019?",     
      "answer": "129,454",
      "derivation": "129,454 - 0",                                      
      "answer_type": "arithmetic",                       # The answer type including `span`, `spans`, `arithmetic`, `counting` and `counterfactual`.
      "answer_from": "table",                            # The source of the answer including `table`, `text` and `table-text`
      "rel_paragraphs": [                                                
      ],
      "req_comparison": false,                                           
      "scale": "thousand"                                # The scale of the answer including `None`, `thousand`, `million`, `billion` and `percent`
    },
    {
      "uid": "e982f2be72b37a222d61fe645df00168",                     
      "order": 2,                                                        
      "question": "What would be the change in net debt receipts between 2018 and 2019 if the amount in 2018 was 100,250 thousand instead?",     
      "answer": "29,204",
      "derivation": "129,454 - 100,250",                                      
      "answer_type": "arithmetic",                                      
      "answer_from": "table",                                           
      "rel_paragraphs": [                                                
      ],
      "req_comparison": false,                                           
      "scale": "thousand",
      "rel_question": 1                                  # The order of the corresponding original question under the same table. 
    }
  ]
}
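As an illustration, hypothetical questions can be matched to their factual counterparts via `rel_question`, which stores the `order` of the corresponding original question under the same table. This is a minimal sketch; only the field names come from the sample above, and the abbreviated values are for demonstration only (a real data file would be read with `json.load`):

```python
def pair_questions(entry):
    """Pair each hypothetical question with its original question
    under the same table, using the `rel_question` field."""
    by_order = {q["order"]: q for q in entry["questions"]}
    return [(by_order[q["rel_question"]], q)
            for q in entry["questions"] if "rel_question" in q]

# A minimal in-memory entry mirroring the structure above (abbreviated).
entry = {
    "questions": [
        {"order": 1, "answer": "129,454",
         "question": "What was the change in net debt receipts between 2018 and 2019?"},
        {"order": 2, "answer": "29,204", "rel_question": 1,
         "question": "What would be the change ... if the amount in 2018 was 100,250 thousand instead?"},
    ]
}

for original, hypothetical in pair_questions(entry):
    print(original["answer"], "->", hypothetical["answer"])  # 129,454 -> 29,204
```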

Leaderboard

| Rank | Model Name | Team Name | Exact Match | F1 | Created |
|------|------------|-----------|-------------|----|---------|
| - | Human Performance | - | 84.1 | 90.8 | - |
| 1 | RSTQA: Rounds Specified numerical reasoning for Table-Text QA | NLP2CT | 57.6 | 58.2 | 14 Dec 2023 |
| 2 | TagOp - L2I | NExT | 54.4 ± 1.0 | 54.7 ± 1.0 | 13 May 2022 |

Evaluation

To evaluate your models, we have also made available the official evaluation script. To run the evaluation, use

python evaluate.py gold_data_path.json prediction_result_path.json 0

Predictions Format

The predictions file is a JSON dictionary mapping question ids to predictions, where each prediction is an array containing both the `answer` and the `scale`. For example,

{
  "9337c3e6-c53f-45a9-836a-02c474ceac16": [
    "4.6",
    "percent"
  ],
  "c4170232-e89c-487a-97c5-afad45e9d702": [
    "16",
    "thousand"
  ],
  "d81d1ae7-363c-4b47-8eea-1906fef33856": [
    ["2018", "2019"],
    ""
  ]
  ...
}
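A predictions file in this format can be produced with a few lines of Python. This is only a sketch: the uids and values are copied from the example above, and the output path is illustrative:

```python
import json

# Each value is [answer, scale]; multi-span answers are lists,
# and an empty string denotes no scale.
predictions = {
    "9337c3e6-c53f-45a9-836a-02c474ceac16": ["4.6", "percent"],
    "c4170232-e89c-487a-97c5-afad45e9d702": ["16", "thousand"],
    "d81d1ae7-363c-4b47-8eea-1906fef33856": [["2018", "2019"], ""],
}

with open("prediction_result_path.json", "w") as f:
    json.dump(predictions, f, indent=2)
```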

The format of the sample prediction file of TAT-QA also applies to TAT-HQA.

Submission

Please email us the prediction file on the test set along with the following information:

Please allow up to two weeks for us to evaluate your submission; we will then add your model to the leaderboard.

Contact

For more information, please contact:

Please kindly cite our work if the dataset helps your research.


@inproceedings{li-etal-2022-learning,
    title = "Learning to Imagine: Integrating Counterfactual Thinking in Neural Discrete Reasoning",
    author = "Li, Moxin  and
      Feng, Fuli  and
      Zhang, Hanwang  and
      He, Xiangnan  and
      Zhu, Fengbin  and
      Chua, Tat-Seng",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = may,
    year = "2022",
    address = "Dublin, Ireland",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-long.5",
    doi = "10.18653/v1/2022.acl-long.5",
    pages = "57--69"
}