Revision 103 as of 2018-05-25 21:17:53
This page describes the reviewing "algorithm" for each of the two ESA 2018 Track B PCs.
Because of the experiment and because it's a good idea anyway, we try to specify it beforehand. We will use a relatively standard algorithm: three independent reviews per submission with a score from {−2, −1, 0, +1, +2} each, followed by a discussion phase, where the reviewers discuss with each other and submissions are proposed for acceptance or rejection in rounds.
Though standard and used in many program committees, it is not easy to give a full specification of this algorithm. The basic procedure is clear, but there are many eventualities, most of which will not happen, but some of which will, and it's hard to say in advance which. There is also a fair amount of complex human judgement involved that is hard to formalize. And there are many variations.
We try to be as specific as possible, without being overly complicated or impractical. The result is not a 100% complete and precise specification of the process. We will fill in the gaps and fix problems in a reasonable way as we go along. As far as the experiment is concerned, these conditions are not perfect, but reasonable given the complexity of the process and the agents involved. That being said, please let me know if you see a way to improve the following specification.
Schedule
The total time for the reviewing process (from the submission deadline to the author notification) is 8 weeks. The reviewing process proceeds in the following phases:
1. The deadline for submissions is April 22 AoE (strict)
2. Bidding and paper assignment: ~ 1 week (April 24 - April 27)
3. Reviewing: ~ 4 weeks (April 28 - May 24)
4. Discussion and recalibration of reviews: ~ 2 weeks (May 25 - June 10)
5. Buffer for things going wrong or taking longer than expected: 1 week
6. The notification deadline is June 18 (maybe earlier)
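As a quick sanity check of the "8 weeks" figure, the span between the two deadlines can be computed directly. This is only a sketch: the year 2018 is taken from the conference name, and the AoE time zone of the submission deadline is ignored.

```python
from datetime import date

# Deadlines from the schedule above (year assumed from "ESA 2018").
submission_deadline = date(2018, 4, 22)
notification_deadline = date(2018, 6, 18)

# Whole days between the two deadlines, split into full weeks and leftover days.
total_days = (notification_deadline - submission_deadline).days
full_weeks, extra_days = divmod(total_days, 7)

print(total_days, full_weeks, extra_days)
```

The result is eight full weeks plus one day, which matches the stated total.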
Reviews
We expect around 50 submissions. Each submission should receive 3 reviews (more reviews are possible, but this is the exception). Since each of the two PCs has exactly 12 members, this is an expected load of around 12 submissions per PC member.
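The expected load follows from a simple count of reviews divided by reviewers. A minimal sketch, where the function name `expected_load` is only illustrative:

```python
def expected_load(submissions, reviews_per_submission, pc_members):
    """Average number of submissions each PC member has to review."""
    return submissions * reviews_per_submission / pc_members

# Numbers from the paragraph above: about 50 submissions, 3 reviews each, 12 PC members.
load = expected_load(50, 3, 12)
print(load)
```

With these numbers the average comes out at 12.5, which is the "around 12" quoted above.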
Sub-reviewers and Conflicts of Interest
We recommend that you review the submissions yourself, but you may ask sub-reviewers for some of the submissions if you prefer to do so. In any case, you should familiarize yourself with each submission assigned to you and its review, so that you can have a competent discussion with the other PC members. The discussion phase is an essential part of the reviewing process.
During the bidding phase you can and should also identify any Conflict of Interest (CoI) with any submission. If you make use of sub-reviewers, you should make sure that they do not have a CoI either. In case of doubt, they should write something about this as part of their review, and the respective PC member should add this part in the Comments to the PC field in EasyChair. Typical cases for a CoI for a submission are:
1. One of the authors is your relative/significant other
2. One of the authors has been your advisor or PhD student in the last 10 years
3. One of the authors comes from the same department as you
4. You feel there is a CoI for another reason (for example, you have many joint publications)
Guidelines for the Review Text
Each review should provide the following information:
1. A short summary of the main contribution(s) of the submission in the words of the reviewer
2. An itemized list of the strengths and weaknesses of the submission
2.1 The strengths should be numbered (S1), (S2), ...
2.2 The weaknesses should be numbered (W1), (W2), ...
The purpose of the numbering is so that it is easy to reference these items in the discussion. The numbering can but does not have to express a relative ranking of the strengths and weaknesses.
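The numbering scheme is easy to produce mechanically if a review is drafted as plain lists. The following is a small sketch; the helper name `number_items` and the example items are hypothetical and not part of any EasyChair feature.

```python
def number_items(items, prefix):
    """Label items as (S1), (S2), ... or (W1), (W2), ... depending on the prefix."""
    return [f"({prefix}{i}) {text}" for i, text in enumerate(items, start=1)]

# Placeholder review items, for illustration only.
strengths = number_items(["Clear problem statement", "Thorough experiments"], "S")
weaknesses = number_items(["No comparison with prior work"], "W")

for line in strengths + weaknesses:
    print(line)
```

In the discussion, other reviewers can then refer to an item simply as (S2) or (W1).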
Each review also can provide the following information (the authors will be grateful to you):
3. More detailed explanations of the strengths and weaknesses
4. Comments to the authors for improving the paper
You can change your review text in the discussion phase. However, the discussion (and the whole reviewing process) will not work if your initial review is not substantial.
Guidelines for the Review Score
Each review should provide one of the following scores. Just like the text of your reviews, these scores are important for the discussion phase. You can change your scores during the discussion phase, but it will greatly help the efficiency and quality of the process if you hit the "right" score for a paper already in your initial review.
Score | Assessment of submission | Behavior during discussion |
---|---|---|
+2 (accept) | Good fit and no major weaknesses | I would champion this paper and fight against rejection |
+1 (weak accept) | Significant weaknesses, but still acceptable | I would support this paper, but not fight against rejection |
0 (borderline) | Hovering between +1 and −1 | Not sure about the severity of the weaknesses / the threshold for ESA |
−1 (weak reject) | Significant weaknesses, lean to reject | I would not support this paper, but not fight against acceptance |
−2 (reject) | Bad fit or major weaknesses | I would oppose this paper and fight against acceptance |
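For readers who prefer to see the table as data, here is a minimal sketch of the five scores as a lookup table. The names `SCORES` and `describe` are illustrative only; the verdicts and behaviors are copied from the table above.

```python
# Score -> (verdict, intended behavior during the discussion), as in the table above.
SCORES = {
    +2: ("accept", "champion this paper and fight against rejection"),
    +1: ("weak accept", "support this paper, but not fight against rejection"),
    0: ("borderline", "not sure about the severity of the weaknesses / the threshold for ESA"),
    -1: ("weak reject", "not support this paper, but not fight against acceptance"),
    -2: ("reject", "oppose this paper and fight against acceptance"),
}

def describe(score):
    """Return a one-line description of a score; reject values outside the scale."""
    if score not in SCORES:
        raise ValueError(f"score must be one of {sorted(SCORES)}, got {score}")
    verdict, behavior = SCORES[score]
    return f"{verdict}: {behavior}"

print(describe(+1))
```

Rejecting anything outside the five values mirrors the decision not to offer +3 or −3.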
Some conferences also have +3 (strong accept) and −3 (strong reject). Experience shows that they are of little use for deciding on the set of accepted papers for a moderate number of submissions, as in ESA Track B (around 50).
Some conferences disallow the borderline score of 0, to force reviewers to take a clear position. In the discussion phase, we indeed ask reviewers to commit to one of the other scores. But for the first review, we think it makes sense to allow this score, because it reflects one of the typical sentiments about a paper at this stage of the reviewing process, as expressed by the "Hovering between +1 and −1" in the table above.
Reviewers might not yet be fully aware of how they will behave during the discussion phase, for various reasons (for example: not sure about some aspects of the paper, not sure about the nature of the threshold for ESA Track B, general inexperience in reviewing). This can make choosing the right score difficult. It is exactly one of the tasks of the discussion phase (described next) to bring the final scores (and reviews) closer to what they are supposed to reflect.
Discussion Phase
The discussion phase starts as soon as all the reviews are in. It lasts approximately two weeks; see the schedule above.
Beginning of the Discussion Phase
At the beginning of the discussion phase, each PC member should do the following (all the discussion and communication happens within EasyChair):
1. Read the reviews from the other reviewers
2. Comment on contrary arguments or ask questions if something is unclear
3. Adapt your review and possibly the score to what you have learned from the discussion
4. If your initial score was 0, change it away from 0 based on what you have learned from the other reviews and from the discussion
Groups of submissions
To specify the decision process, it is useful to define the following partitioning into groups. Except for Group X (which hopefully will be empty), the descriptions assume that there are at least three reviews for each submission. The description in parentheses says what is likely to happen to a submission in this group. This will be described in more detail in the next section.
Group A1 : clear support (will probably be accepted)
Group A2 : at least one champion + weak support from the others (good chance to be accepted)
Group A3 : only weak support from everybody (might be accepted if room)
Group C1 : weak support + weak or strong opposition (resolve or reject)
Group C2 : strong support + weak opposition (resolve or vote in the end)
Group C3 : strong support + strong opposition (resolve or vote in the end)
Group R1 : strong opposition (will almost certainly be rejected)
Group R2 : mix of strong and weak opposition (will probably be rejected)
Group R3 : weak opposition from everybody (will probably be rejected)
Group X : two of the reviews are missing or completely lack substance (acquire missing/additional reviews)
The assignment of a submission to one of these groups will not be done by score alone, but also based on what is written in the reviews. Of course, there will be a strong correlation to the scores. In fact, if the scores were perfect, the correlation would be perfect. But it lies in the nature of the process that some reviewers (and PC members) are unsure about a submission or about the threshold for ESA. So one important part of the discussion phase is to bring the scores closer to what they are intended to reflect.
For example, a submission with scores {+2, +2, +2} will probably be in Group A1 (unless the support expressed in the reviews is weaker than it might appear from the scores, in which case Group A2 or even A3 might be more appropriate), and a submission with scores {−2, −2, −2} will probably be in Group R1 (unless the reviews are more positive about the paper than it might appear from the scores, in which case Groups R2 or R3 or even C1 might be more appropriate).
Submissions can change groups at any time due to the ongoing discussions and corresponding changes in the reviews and/or scores.
The group assignment of a submission can also be challenged by other PC members (who did not write one of the three original reviews for the submission). For example, if another PC member formulates an argument against a submission from Group A2, that submission will go into Groups C2 or C3.
No decision is final until the end of the discussion phase.
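To make the group definitions concrete, here is a rough sketch of the grouping that the scores alone would suggest. It is only a first approximation: as stated above, the actual assignment also depends on the review texts and can be challenged. The function name `provisional_group` is hypothetical, and two choices are our own assumptions: a submission with at most one score is put into Group X, and `None` is returned while any score is still 0.

```python
ALLOWED_SCORES = {-2, -1, 0, 1, 2}

def provisional_group(scores):
    """Suggest a group from scores alone (the review texts are not considered)."""
    if any(s not in ALLOWED_SCORES for s in scores):
        raise ValueError("scores must be taken from {-2, -1, 0, +1, +2}")
    if len(scores) <= 1:
        return "X"  # assumption: two of the three expected reviews are missing
    if 0 in scores:
        return None  # assumption: undecided until the reviewer moves away from 0

    strong_support = 2 in scores
    weak_support = 1 in scores
    strong_opposition = -2 in scores
    weak_opposition = -1 in scores
    support = strong_support or weak_support
    opposition = strong_opposition or weak_opposition

    if support and not opposition:
        if not weak_support:
            return "A1"  # clear support
        return "A2" if strong_support else "A3"
    if opposition and not support:
        if not weak_opposition:
            return "R1"  # strong opposition only
        return "R2" if strong_opposition else "R3"
    # Mixed support and opposition.
    if not strong_support:
        return "C1"  # weak support + weak or strong opposition
    return "C3" if strong_opposition else "C2"

print(provisional_group([2, 1, 1]))
```

For instance, the scores {+2, +1, +1} map to Group A2, and {+1, +1, −2} map to Group C1.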
Decision Process (Rounds)
After the preparation above (or partly in parallel to it), the discussion will proceed in rounds. Each round lasts several days. In each round, the PC chair will suggest certain submissions for acceptance and others for rejection. In EasyChair, these submissions will be marked accept? and reject?. PC members can challenge these suggestions until the next round. In each round after the first, submissions that were marked accept? or reject? in the previous round and that were not challenged will be marked ACCEPT and REJECT, respectively. If nobody challenges these decisions anymore, they will become the final decisions for these submissions.
Submissions that have changed groups will be treated as they would have been treated within their new group in a previous round. For example, if for a submission from Group C2 (strong support + weak opposition) the opposition crumbles, the submission moves to Group A2 and will be suggested for accept? in the next round. Or, if for a submission from Group C1 (weak support + strong opposition) the support crumbles, the submission moves to Group R2 and will be suggested for reject? in the next round.
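The progression of the marks from one round to the next can be sketched as a tiny state transition. The function name `next_mark` is hypothetical. What happens to a challenged submission is not spelled out here, so returning `None` (no mark, back to open discussion) is purely our assumption.

```python
def next_mark(mark, challenged):
    """Mark of a submission in the following round, given its current mark."""
    if challenged:
        return None  # assumption: a challenge removes the mark and reopens discussion
    promotions = {"accept?": "ACCEPT", "reject?": "REJECT"}
    # Unchallenged suggestions are promoted; other marks stay as they are.
    return promotions.get(mark, mark)

print(next_mark("accept?", challenged=False))
print(next_mark("reject?", challenged=True))
```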
Round 1 : A1 → accept?, R1 → reject?, A2 and A3 and C1 → push for (more) champions, C2 → challenge opposition, C3 → push for resolution
Round 2 : A2 → accept?, R2 → reject?, A3 and C1 → push for champion, C2 → challenge opposition, C3 → push for resolution
Round 3 : R3 → reject?, A3 → push for champion, C1 → reject?, C2 → challenge opposition, C3 → push for resolution
Round 4 : A3 and C2 and C3 → send email to PC with short summary for each of these + call for vote
Round 5 : Suggestion for final decisions
Round 6 : Finalize decisions
There should be as few submissions as possible left in Groups C2 and C3 by the end of Round 3. The vote is really just an emergency measure for submissions where, despite all attempts, the controversy could not be resolved.
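The per-round treatment of the groups in Rounds 1 to 4 can be written down as a lookup table. This is a sketch under one interpretation: a submission whose group is not listed in the current round gets the treatment that group received in the most recent earlier round, which is how we read the rule for submissions that change groups. The names `ROUND_ACTIONS` and `action_for` are hypothetical.

```python
# Round -> {group: action}, transcribed from the list of rounds above.
ROUND_ACTIONS = {
    1: {"A1": "accept?", "R1": "reject?",
        "A2": "push for (more) champions", "A3": "push for (more) champions",
        "C1": "push for (more) champions",
        "C2": "challenge opposition", "C3": "push for resolution"},
    2: {"A2": "accept?", "R2": "reject?",
        "A3": "push for champion", "C1": "push for champion",
        "C2": "challenge opposition", "C3": "push for resolution"},
    3: {"R3": "reject?", "A3": "push for champion", "C1": "reject?",
        "C2": "challenge opposition", "C3": "push for resolution"},
    4: {"A3": "summary + call for vote", "C2": "summary + call for vote",
        "C3": "summary + call for vote"},
}

def action_for(group, round_no):
    """Action for a group in a round, falling back to earlier rounds if unlisted."""
    for r in range(min(round_no, 4), 0, -1):
        if group in ROUND_ACTIONS[r]:
            return ROUND_ACTIONS[r][group]
    return None  # group has not been treated yet (e.g. R2 in Round 1)

# A submission that moved from C2 to A2 is suggested for accept? in Round 3.
print(action_for("A2", 3))
```

Rounds 5 and 6 are left out because they concern the final decisions as a whole rather than individual groups.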
At the beginning of each round, the approximate number of "free slots" (total number of submissions that can be accepted minus the number of submissions already marked ACCEPT) will be announced. This can of course influence the discussions and the fates of the submissions.
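The number of free slots is a plain difference. In the sketch below, the function name `free_slots` and the capacity of 15 are made-up values for illustration; the actual number of submissions that can be accepted is not stated on this page, and only an approximate figure is announced.

```python
def free_slots(capacity, marks):
    """Acceptable submissions minus those already marked ACCEPT."""
    accepted = sum(1 for mark in marks if mark == "ACCEPT")
    return capacity - accepted

# Hypothetical capacity and current marks, for illustration only.
current_marks = ["ACCEPT", "accept?", "REJECT", "ACCEPT", "reject?"]
print(free_slots(15, current_marks))
```

Note that submissions marked accept? do not count yet, since those suggestions can still be challenged.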
Change Log
A link to this web page was first sent to the PCs on April 20 00:41 UTC. The following list mentions only significant changes in content. Purely editorial changes or improvements in formulations which do not change the meaning significantly are not logged.
April 20 16:30 UTC: added information about conflicts of interest (for PC members and sub-reviewers) and added "approximate" to the last paragraph in the description of the discussion phase to make sure that PC members cannot deduce information about their CoI submissions from the current status quo.
April 21 15:12 UTC: made partition into groups slightly more fine-grained and more complete. Adapted the descriptions of the rounds accordingly.
May 25 20:17 UTC: clarified that category C1 is for a combination of weak support and either weak or strong opposition.