Challenge guidelines

Details about the data, performance measure, baseline features and prediction results are provided in the baseline paper.

To get started

Registration to participate in the challenge is now closed – the results of the challenge will be announced during the Workshop on October 23, in Room Lovelace, Computer History Museum, Mountain View, CA, USA. New requests to access the datasets are still possible (please send the corresponding EULA to the appropriate Data chair – see below for links and contact persons), as is the possibility to submit predictions for performance evaluation on the test partition – strongly recommended for a fair comparison with the results reported in the challenge.

Please contact Fabien Ringeval by email to register your team. This email must be sent from an institutional email address, with the subject “AVEC’17”, and should include:

  • Team name, e.g., name of your research institute/university
  • Team short name (16 characters maximum)
  • Team leader (must be a permanent researcher)
  • Link of team leader’s homepage on the institution website
  • Team members, including affiliation and email address
  • Signed EULA(s)

The list of team members must match the one provided on the EULAs (see below); due to the sensitive nature of the data, it must not be shared with anyone not listed on the EULA, including lab mates. Upon registering your team, you will be given a link, a username and a password to access the files.

The EULA for the Depression Sub-Challenge can be downloaded from the DAIC-WOZ website.

The EULA for the Emotion Sub-Challenge can be downloaded from the SEWA website.

After downloading the data, you can directly start your experiments with the train and development sets. Once you have found your best method, you should write your paper for the Workshop. At the same time, you can compute your predictions per instance of the test set and send them to the organisers, who will then let you know your performance result. See below for more information on the submission process and the way performance is measured.

Depression Sub-Challenge data

The organisers provide the following data:

  • Audio data (.wav) – scrubbed
  • Audio features (.csv)
  • Transcripts (.csv) – scrubbed
  • Video features (sampled at 30 fps from videos)
    • Pixel coordinates for 68 facial landmarks (.txt)
    • World coordinates for 68 3D facial landmarks (.txt)
    • Gaze vector (.txt)
    • Headpose vector (.txt)
    • HOG features (.bin)
    • AU labels (.csv)
  • Depression labels and score per participant, based on PHQ8 questionnaire for the train and development splits (.csv)
  • Gender info for train, development and test splits.

More details can be found in the relevant documentation at the DAIC-WOZ website.

Emotion Sub-Challenge data

The organisers provide the following data:

  • Audio data (.wav)
  • Video data (.avi)
  • Turn timings (.csv)
  • Audio features
    • 23 LLDs from the eGeMAPSv0.1a feature set (openSMILE)
    • 88 acoustic features from the eGeMAPSv0.1a feature set (openSMILE)
    • Bag-of-audio-words representation of the LLDs (openXBOW)
  • Video features
    • Face orientation – in terms of pitch, yaw and roll (in degrees)
    • Pixel coordinates for 10 eye points
    • Pixel coordinates for 49 facial landmarks
    • Bag-of-video-words representation of the normalised video features
  • Emotion labels (arousal, valence, and liking) with 100 ms frame rate
  • Gender and age for train and development splits.

More details can be found at the SEWA website.

Results submission

Each participant has up to five submission attempts. You can submit results until the final results deadline, which precedes the camera-ready deadline. Your best results will be used to determine the winner of the challenge. Please send submissions by email to Stefan Scherer for the Depression Sub-Challenge, and to Maximilian Schmitt for the Emotion Sub-Challenge. Results should be sent as a single zip file whose name includes your team name and the number of the attempt.

For the Depression Sub-Challenge, the zip file should contain only one file: “test_prediction.csv”. The data in the results file must be formatted in exactly the same way as the training/development gold standard label files, i.e., one CSV (comma-separated values) file named “test_prediction.csv” containing three attributes: the participant_ID, the prediction in binary format (0 = PHQ8 score < 10, 1 = PHQ8 score >= 10), and the prediction of the PHQ8 score as a numeric value (CHALLENGE TARGET). The organisers will provide the average F1 score over the classes “depressed” and “not_depressed”, as well as the mean absolute error (MAE) and root mean square error (RMSE), which will be used to rank participants. The overall accuracy, average precision, and average recall for the binary scores will be provided as well, and can be used by the authors to further discuss their results in the paper accompanying their submission. Participants are also strongly encouraged to report a confusion matrix on the development set to discuss the precision of their algorithms on each class.
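As an illustration, a prediction file in this spirit could be written as follows. Note that the header names and participant IDs below are assumptions made for the sketch; the authoritative format is that of the gold standard label files distributed with the train/development partitions.

```python
import csv

# Hypothetical predictions: participant ID -> predicted PHQ8 score.
# IDs, scores, and column names are illustrative assumptions only --
# check the gold standard label files for the exact header.
predictions = {300: 4.2, 301: 12.7, 302: 9.1}

with open("test_prediction.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["participant_ID", "PHQ8_Binary", "PHQ8_Score"])
    for pid, score in sorted(predictions.items()):
        # Binary label: 1 if the predicted PHQ8 score is >= 10, else 0.
        writer.writerow([pid, int(score >= 10), score])
```
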

For the Emotion Sub-Challenge, the zip file should contain three directories: “arousal”, “valence”, and “liking”. The data in the results files must be formatted in exactly the same way as the training/development gold standard label files, and the filenames must follow exactly the same convention. For each dimension, the organisers will provide the Concordance Correlation Coefficient (CCC), which will be used to rank participants. Pearson’s correlation coefficient and the RMS error will be provided as well, and can be used by the authors to further discuss their results in the paper accompanying their submission. These performance metrics will be computed on the concatenation of all development/test instances (i.e., gold standard and prediction).
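For reference, the standard (Lin’s) Concordance Correlation Coefficient over two concatenated sequences can be sketched in plain Python as below. This is a sketch of the textbook formula; the organisers’ official scoring script is authoritative and may differ in implementation details (e.g., bias correction).

```python
import statistics

def ccc(gold, pred):
    """Concordance Correlation Coefficient between two equal-length sequences."""
    mg, mp = statistics.fmean(gold), statistics.fmean(pred)
    # Population (biased) variances and covariance, as in Lin's CCC.
    vg = sum((g - mg) ** 2 for g in gold) / len(gold)
    vp = sum((p - mp) ** 2 for p in pred) / len(pred)
    cov = sum((g - mg) * (p - mp) for g, p in zip(gold, pred)) / len(gold)
    return 2 * cov / (vg + vp + (mg - mp) ** 2)
```

Unlike Pearson’s correlation, the CCC penalises both a mean offset and a scale mismatch between prediction and gold standard, which is why it is preferred for ranking here.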

Important: the top two performers of the challenge will be asked to submit their program to us (Depression Sub-Challenge: Stefan Scherer, University of Southern California; Emotion Sub-Challenge: Maximilian Schmitt, University of Passau) so that we can verify the results, both on the original test set and on extra hold-out data. The program may be delivered (partly) as an executable or, e.g., as encrypted Matlab code; we will endeavour to cater for all possible variations in operating systems, etc., but we do ask you to be available in the period of 14 September – 5 October to work with our team on validating your results.

Submission policy

All papers submitted to AVEC must conform to the Double-Blind Review policy (see the ACM MM Submission Policy) and must be formatted according to the acm-sigconf template, which can be obtained from ACM proceedings style.

Papers may be 6 to 8 pages long, plus one additional page for the references.

In your submission, please refer to the baseline paper for details about the dataset and baseline results. This makes for a more readable set of papers, compared to each workshop paper repeating the same information.

Submissions must strictly adhere to page limits and Double-Blind Review policy. Papers exceeding the page limits will be rejected without review. The maximum allowed file size is 10 MB.

Papers should be submitted through the AVEC 2017 EasyChair submission site. The deadline for paper submission is July 19 AoE (Anywhere on Earth, i.e., 12 hours behind UTC).

Frequently asked questions/made comments:

Where can I find information about the AVEC 2017 Challenge? Information about the challenge, the data we are releasing, performance metrics and baseline features can be found in the baseline paper, which is currently being prepared. Please note that the content of this paper may change until the camera ready deadline.

I have already downloaded the DAIC-WOZ/AVEC 2016 database, shall I send another EULA for AVEC 2017? Yes, please: the username and password used for DAIC-WOZ will not be the same as those for accessing the AVEC 2017 dataset; moreover, you may want to change the list of lab mates who will use the data.

Can I submit my paper to another workshop/conference? You are free to submit your results to any venue. However, only papers submitted to AVEC will be used for ranking.

How many times can I submit my results? You can submit results five times. We will not count badly formatted results towards one of your five submissions.

I’ve submitted a set of results, when will I receive my scores? During an active challenge, we strive to return scores within 24 hours on typical working days (Monday – Friday); however, please be patient, as this is subject to workload. Under no circumstances should you spam the organisers, as this simply delays the process for all teams.

Can I have access to the mailing list of participants? Can you tell me the results of other teams? Absolutely not – registrations and results are not for public view. If there is sufficient demand, we may consider offering a separate opt-in mailing list for teams to discuss the challenge.

Can we use other datasets to pre-train the model? The use of external sources is allowed: any other dataset can be used for training your models. Please, however, describe these additional datasets when writing your paper.

Can we use the development and test data during training? Both training and development partitions can be used for training your models, as well as additional resources – see above. The test partition, for which the labels are unknown, must solely be used for testing your system. It is strictly forbidden to perform any annotation on the test partition! The top-two performers of the challenge will be asked to submit their program to us to verify the results, both on the original test set and extra hold-out data.

I have submitted all my results on the test partition, can I know my rank now? The ranks of the participants will be announced at the end of the Workshop day; there is no need to send personal requests, as we will keep the final results private until then.

Will we be ranked on the mean of our best attempt, or will you first select our best scores for the emotional dimensions (arousal, valence, and liking) independently and then compute a mean? The ranking will be based on the mean of the best CCC scores obtained on the three emotional dimensions, independently of the attempt, so there is no need to submit a trial with the best system tuned for each dimension. Comments on why a specific architecture might work significantly better than another are thus strongly encouraged, in case such observations occur.
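To make the ranking rule concrete, here is a small sketch with made-up CCC scores (the numbers are purely illustrative): the best score for each dimension is picked across all attempts, and the mean of those three values determines the rank.

```python
# Made-up test-set CCC scores per submission attempt (illustrative only).
attempts = [
    {"arousal": 0.52, "valence": 0.61, "liking": 0.10},
    {"arousal": 0.58, "valence": 0.57, "liking": 0.14},
]

dimensions = ("arousal", "valence", "liking")

# Best CCC per dimension, picked independently of the attempt:
# arousal from attempt 2, valence from attempt 1, liking from attempt 2.
best = {dim: max(a[dim] for a in attempts) for dim in dimensions}

# The ranking score is the mean of the three per-dimension bests.
ranking_score = sum(best.values()) / len(dimensions)
```
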