This page provides details about the ESA 2018 Track B Experiment. ESA (European Symposium on Algorithms) is one of the premier conferences on algorithms. It has two tracks: Track A (Design and Analysis) and Track B (Engineering and Applications). The basic setup of the experiment is as follows: there will be two separate PCs for Track B, which will independently review all the submissions to Track B and independently decide on a set of papers to be accepted. After the PCs have done their work, the results will be compared both quantitatively (e.g., the overlap in the sets of accepted papers) and qualitatively (e.g., typical reasons for differences in the decisions of the two PCs). The results of this comparison will be published. Depending on the outcome, the set of accepted papers for Track B will either be the union of the sets of accepted papers from the two independent PCs, or there will be a final round (outside of the experiment) discussing the submissions on which the two PCs reached different conclusions.
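As an illustration of the quantitative comparison, the overlap between the two accept sets could be measured as follows. This is only a sketch; the paper identifiers, the function name, and the choice of Jaccard overlap as the measure are my own assumptions, not part of the experiment's specification:

```python
# Illustrative comparison of two PCs' accept decisions.
# Paper IDs below are hypothetical, not real ESA 2018 submissions.

def overlap_stats(accepted_a: set, accepted_b: set) -> dict:
    """Basic agreement measures between two accept sets."""
    both = accepted_a & accepted_b    # papers accepted by both PCs
    either = accepted_a | accepted_b  # the "union rule" accept set
    jaccard = len(both) / len(either) if either else 1.0
    return {
        "accepted_by_both": sorted(both),
        "union": sorted(either),
        "jaccard_overlap": jaccard,
    }

pc1 = {"p03", "p07", "p11", "p19"}
pc2 = {"p03", "p07", "p14", "p19"}
stats = overlap_stats(pc1, pc2)
print(stats["jaccard_overlap"])  # → 0.6 (3 of the 5 papers in the union accepted by both)
```

A qualitative comparison (reasons for diverging decisions) would of course require reading the discussions, not just the decision sets.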
Selection of the two PCs
Both PCs have 12 members, and both have the same PC chair. The complete list of members can be found here: http://algo2018.hiit.fi/esa/#pc . The PCs have been set up to have an identical distribution with respect to topic, age group, gender, and continent, in the following sense. The topics are only a rough categorization of what the respective PC members work on (many work on more than one topic, and the topics are not clear-cut anyway).
- Gender: 8 men, 4 women
- Age group: 2 x junior (PhD <= 5 years ago), 4 x relatively junior (PhD <= 10 years ago), 6 x senior
- Continent: 8 x Europe, 4 x Americas (we tried Asia, but weren't successful, sorry for that)
- Topic: 1 x parallel algorithms (junior), 2 x string algorithms (one less senior, one more senior), 2 x computational geometry (one junior, one senior), 2 x operations research (one junior, one senior), 5 x algorithms in general (three junior, two senior)
Reviewing Algorithm
The reviewing algorithm is essentially the same as in previous years. Because of the experiment, and because it is a good idea anyway, we tried to specify it beforehand. However, this is not a 100% complete and precise specification of the process. The goal was to be as specific as possible without making the description overly complicated or impractical. We will fill in the gaps and fix problems in a reasonable way as we go along, taking care to treat both PCs equally. As far as the experiment is concerned, these conditions are not perfect, but they appear OK and reasonable given the complexity of the process and the agents involved.
Phases
1. The deadline for submissions is April 22 AoE (strict)
2. Bidding and paper assignment: 1 week (~ April 23 - April 29)
3. Reviewing: 4 weeks (~ April 30 - May 27)
4. Discussion + recalibration of reviews: 2 weeks (~ May 28 - June 10)
5. Buffer for things going wrong or taking longer than expected: 1 week
6. Notification deadline is June 18 (maybe earlier)
Scores for a single review
In the reviewing phase, each review provides a textual assessment of the paper, along with one out of five possible scores:
Score | Verdict | Behavior during discussion
+2 (accept) | No major weaknesses | I would champion this paper and fight against a rejection
+1 (weak accept) | Significant weaknesses, but nothing fatal | I would support this paper, but not fight against rejection
0 (borderline) | Hovering between +1 and −1 | Not sure about the severity of the weaknesses / the threshold for ESA
−1 (weak reject) | Significant weaknesses, but nothing fatal | I am not supporting this paper, but would also not fight against acceptance
−2 (reject) | Major weaknesses | I am opposing this paper and fight against acceptance
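For bookkeeping, the five-point scale can be encoded directly. The sketch below merely restates the table above; the variable and function names are my own and not part of any ESA tooling:

```python
# The five review scores and their verdicts, as given in the table above.
# Names here are illustrative only.
REVIEW_SCORES = {
    2: ("accept", "No major weaknesses"),
    1: ("weak accept", "Significant weaknesses, but nothing fatal"),
    0: ("borderline", "Hovering between +1 and -1"),
    -1: ("weak reject", "Significant weaknesses, but nothing fatal"),
    -2: ("reject", "Major weaknesses"),
}

def verdict(score: int) -> str:
    """Return the verdict label for a score, rejecting anything off-scale."""
    if score not in REVIEW_SCORES:
        raise ValueError(f"score must be one of {sorted(REVIEW_SCORES)}")
    return REVIEW_SCORES[score][0]
```

For example, `verdict(2)` returns `"accept"`, while `verdict(3)` raises an error, reflecting Remark 1 below: the scale deliberately stops at ±2.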
Remark 1: Some conferences also have +3 (strong accept) and −3 (strong reject). Experience shows that these are of little use for deciding on the set of accepted papers for a moderate number of submissions (around 50).
Remark 2: Reviewers might not yet be fully aware of how they will behave during the discussion phase, for various reasons (for example: being unsure about some aspects of the paper, unsure about the nature of the threshold for ESA Track B, or general inexperience in reviewing). This can make choosing the right score difficult. But this is exactly one of the tasks of the discussion phase (described below): to bring the final scores (and reviews) closer to what they are supposed to reflect.