2014 Judging System Feedback and Discussion

The FPA’s new judging system was unveiled at some spring 2014 competitions. If you have judged (or been judged) in this system, please discuss your impressions of the system.

Please be specific with your feedback.

If you have a critique, the ideal scenario is to describe the critique, why you believe the current situation is not optimal and how you would fix/improve it.

5 Replies to “2014 Judging System Feedback and Discussion”

Flo says:

June 23, 2014 at 12:50 am

Execution:
The new flexibility for judges looks helpful and makes total sense to me. Interpreting errors should be the task of the judges. I haven’t judged Ex yet, but I guess this gives you, as a judge, more power and more responsibility to look closely, unlike before, when you had an exact definition for each category of error.

Difficulty:
The Consecutivity Bonus is a good idea to sensitize players to consecutivity, but it also overtaxes judges in the short window of 15 seconds.
With beginner teams it is possible to pay attention to consecutivity, but those teams’ combos are too short to earn a positive bonus, and if they are fighting to hit a combo with their limited skills, you don’t want to penalize them with a minus on top of their already low mark.
With advanced teams you have to pay so much attention to the moves that it is nearly impossible to make two decisions within that short time frame. We can keep this bonus system, but don’t force the judges to use it. If there is a situation worth honoring, they have the possibility to give the bonus, and if a team plays for time and makes long, drawn-out, non-consecutive moves, then a minus is also OK, but writing two marks in 15 seconds is one too many!

Time Phrase Judging: A good compromise between the natural-phrase and the time-frame-based solutions! Although I did run into trouble: I got caught up in watching and was not consistent about giving the mark at the expected time. (Please note: I was fully concentrated and didn’t drink or smoke anything, unlike other players at the judging table)

Multiplier: It seems that Difficulty can now reach the same number of points as the other categories. This is an improvement. But it is also possible to get more than 10 points (I had 2 teams in a pool that got more than 10), which means that Diff now has more weight because of the multiplication. Are teams with a strong Diff favored even more?!

Artistic Impression:
The Variety Checklist is a very helpful instrument to sensitize players to what is important in that category. But what happened was that the judges spent 4-5 minutes filling it out, and some who started with that category forgot most of what had happened in the other categories of AI. The breaks between teams were too long, for the audience and also for the players.
Maybe it is worth thinking about outsourcing Variety to one or more independent judges. Or we can hope that, after a while, the judges are educated enough not to need the list.
I also heard from some players who are not OK with the classification of tricks and points. Variety is much more than that and cannot be written down on one sheet of paper.

Flow/Form: Splitting that category and giving each part 10 points doesn’t cause any problems when judging, and the results also seem to be OK.

General Impression: It was a real blow to me that this category was scratched! I saw teams that played really well and were nice to watch. They were entertaining, creative and funny, and they got the audience to clap, smile and stay! But in the end I couldn’t find a category where I could honor that!
I talked with the reporter from Germany’s biggest TV channel who filmed the German Championships.
He said to me: “Your problem is that the players’ outfits don’t look serious! They look like ordinary people in the park – nobody wants to see that!”
Without General Impression:
– We don’t need team outfits. They are irrelevant today.
– Players can swear like a trooper and nobody cares!
– Players can entertain the spectators, animate them, add a short, varied interaction – our current judging system doesn’t care!
(those are only the extremes)

The major argument for scratching General Impression is its subjectivity.
But I think a major cause of the “subjectivity” problem is that people lack knowledge of, and don’t live up to, the meaning of “Spirit of the Game”!
(This is another topic we have to discuss separately – in short: the SOTG is a fair play prize that belongs to all Frisbee disciplines; interpreted for Freestyle, it means every judge has to judge as fairly as possible!)
Eleonora says:

August 18, 2014 at 12:25 pm

During FPAW I had the opportunity to judge using the new system in all three categories. Here are my thoughts about the new judging system. I talked with Fabio and he shares my concerns:

Execution: I like the idea of having more flexibility even though this means introducing more subjective judging (a drop that may not interrupt the flow for one judge, may interrupt it for someone else). This flexibility was already partially there in the choice between 0.1 and 0.2 deductions, but it’s now more extended and it lowers the weight of Execution.

Difficulty: I like the phrase – time block idea. To me, it is just more natural to write the mark when a pause occurs. However, this may lead to very different difficulty scores among the three judges and can create uneven difficulty blocks within the routine (one can last 11 seconds and the next one 19 seconds, for example): this is something we should think about. When judging, I felt like the metronome was kind of distracting me. Maybe it would be better to simply define a window of 2-3 seconds after the voice says “mark” in which the judge can write his/her mark.

During Worlds, I noticed once more that one aspect which difficulty should take into account for outdoor tournaments is weather conditions. We all know how wind can suddenly change and affect a routine, or it may start raining during a routine, making the grass more slippery. I think the system should include this aspect as well, giving a bonus to teams who had to face harsh conditions as well as to teams who handled harsh conditions well.

Pluses and minuses: I think it’s really hard to stay concentrated on giving two different marks for every time block. I talked with other judges and, in many cases, they had the same feeling. Consecutivity is already part of the difficulty mark: a consecutive move/co-op is by definition more difficult than a non-consecutive move/co-op. If we feel like this is not understood, we should simply emphasize it in the judging manual and explain it to newcomers, and eliminate this second mark from diff time blocks.

Overall, I feel like the multiplier and the pluses/minuses system are now giving more weight to difficulty. In a 4-minute routine, a team can get up to 16 extra points plus the multiplier. The system is once again unbalanced and I get the impression that this makes it really hard for newcomers to rise and shine.
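
For reference, the 16-point figure can be reproduced with a minimal sketch. It assumes 15-second difficulty blocks (the window mentioned elsewhere in this thread) and that one plus is worth one point per block; the point value and the function name are assumptions for illustration, not taken from the judging manual.

```python
# Assumption: difficulty is marked in 15-second blocks.
BLOCK_SECONDS = 15


def max_consecutivity_bonus(routine_seconds: int, points_per_plus: float = 1.0) -> float:
    """Upper bound on bonus points if every block earns one plus.

    points_per_plus is a hypothetical value chosen so that a 4-minute
    routine matches the "up to 16 extra points" figure in the comment.
    """
    blocks = routine_seconds // BLOCK_SECONDS  # number of full blocks
    return blocks * points_per_plus


if __name__ == "__main__":
    # 4 minutes = 240 seconds = 16 blocks of 15 seconds
    print(max_consecutivity_bonus(4 * 60))
```

How that bonus interacts with the multiplier depends on details of the score sheet that are not stated here, so the sketch stops at the block count.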

Artistic Impression: I like that Flow and Form are now two separate sub-categories. However, I think this is the category that has undergone the most negative changes. Needless to say, I feel frustrated about how our mixed routine was judged in the FPAW 2014 finals, to the point that I was wondering whether it makes sense to keep competing or not. I think most of the problems stem from the fact that Artistic Impression is now a more sterile category.

As I said during the FPAW meeting before the competition, I think the checklist is both too detailed and incomplete. It is hard to remember all the details of what happened in a 4-minute routine and it is easy to get stuck on the list, without considering aspects that may not be listed there. If we feel we want a list anyway, one option would be to have broader categories such as Throws, Catches, Ambidexterity, Multiple discs, Two spins, etc. where the judge can simply put crosses but, then again, the list risks being incomplete.

Concentrating on the checklist, the judge may forget that variety also means non-repetition. If team A does, say, 5 different catches using only the right hand, while team B does 4 catches using both right and left hand, the risk with the checklist is that the judge will give more variety to team A. In the end, I think everyone has their own method for judging variety. The list is fine in the judging manual as an orientation, but while judging it seems to do more harm than good: freestyle cannot be reduced to a list of items.

In addition to all this, Variety is just one of the sub-categories of Artistic Impression: by making it such a detailed sub-category, judges may not give enough importance and time to the other sub-categories.

This level of detail and the disappearance of General Impression make AI a more aseptic category. To me General Impression represents the soul of freestyle: it expresses not only the “Wow factor”, but also how the players move and play WITH the music, how they build a story, how they interpret it and the vibes their routine gives to the audience. It is the real bonus for creativity and innovation. I would like to see it back and clarified because I think that creativity and innovation are two important aspects of freestyle and because I think that in the past this sub-category was misinterpreted. Cutting General Impression gives way to calculated routines. The essence of the sport seems to be fading away. Let us not forget that Frisbee Freestyle is a sport as well as an art and that it is also and especially meant for the audience.

Eleonora.
Arthur Coddington says:

August 18, 2014 at 2:26 pm

Apologies for the length of my comments. Judging system work is a thankless job, so I offer this critique with respect to the hard work the judging system committee poured into the project.

Overall Feedback
The committee fixed a system that wasn’t broken, and they fixed it without a clearly articulated vision for the freestyle performances it will inspire. My vision for a judging system is one that encourages aggressive play and de-emphasizes theater. The audiences I’ve seen respond to freestyle want skateboarding, not ice dancing. They want to see our equivalent of Tony Hawk’s 900, not another interpretation of Bolero. The best system I’ve seen for this is from last year’s Beach Stylers (http://shrednow.com/2013/12/why-i-love-the-2013-beach-stylers-judging-system/). It rewarded execution and artistic impression while giving real incentive for teams to play closer to their limit. The FPA system needed a bold vision like the one Dave Schiller and Joel Rogers showed at Beach Stylers.

Diff Multiplier –> Balancing the impact of each category
In the past, the impact of difficulty has been muted. The point spread – the difference between the highest and lowest scores – was traditionally a fraction of the spread of AI and Ex. Evening out the point spreads is a good thing because teams should be rewarded for showing state of the art play as much as they are rewarded for avoiding mistakes or creating shows. I like the multiplier because it has improved the Difficulty point spread. In the Open Pairs final, the spread in Diff was slightly less than in the other 2 categories but improved from previous years. It lagged farther behind in the other three finals. We need more data on whether 1.5 is enough.
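
To make the point-spread argument concrete, here is a minimal sketch. The scores are made up for illustration, and applying the 1.5 multiplier directly to each team’s Difficulty score is an assumption about how the system works, not a description of the actual score sheet.

```python
def point_spread(scores):
    """Difference between the highest and lowest score in a category."""
    return max(scores) - min(scores)


def apply_multiplier(scores, multiplier=1.5):
    """Scale every score by the multiplier (1.5 is the value discussed above)."""
    return [score * multiplier for score in scores]


if __name__ == "__main__":
    # Hypothetical raw Difficulty scores for three teams in a pool.
    raw_diff = [4.0, 5.0, 6.0]
    scaled_diff = apply_multiplier(raw_diff)

    print(point_spread(raw_diff))     # spread before the multiplier
    print(point_spread(scaled_diff))  # spread after the multiplier
```

Because the multiplier scales every score, it scales the spread by the same factor, which is why it widens Difficulty’s influence on the ranking without changing the order of teams within the category.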

Consecutivity
I applaud the judging committee for addressing consecutivity in the revisions. Rewarding consecutivity says “playing consecutively both makes people more accomplished freestylers and presents the sport in a more appealing way. We should reward that.” Awesome! While I haven’t yet mastered giving pluses during difficulty judging, I like the concept.

What I dislike is the idea of giving minuses. A plus rewards teams for exceptional consecutive technique. A minus double dings teams – mostly intermediate teams – for playing at their level of skill. A lower-ranked team has already been given a lower score for showing their current level of play. To punish them further seems not only cruel but opposite to the spirit of encouragement and learning. Keep the plus, ditch the minus.

The Tick Tock Diff Method
One of my complaints with the judging revisions is around intention. Some of the changes felt like tinkering without a big-picture intention. Providing an expanded time period for marking Difficulty scores is tinkering. It tried to fix something that wasn’t broken. The old system offered the flexibility this new system claims to create. Judges could consider the next consecutive move in a combo once the “mark” had sounded. The new system creates a bigger problem. The big window creates extremely long blocks followed by extremely short blocks. Judges are waiting for a combo to end, so they can maybe move a score of 6 to 7. The cost is that the next block is short and ends up being artificially low, maybe even a 2. I experienced this dynamic a bunch of times. Over those 2 blocks, the team now scores 7+2 instead of 6+5 or 6+6. The score is not only no longer an accurate reflection of their play, it’s not judging play at a comparable pace to the other routines. In my opinion, some of the wonky diff scores in Medellin might have been caused by this. This new tick-tock approach needs to go.
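
The two-block arithmetic above can be checked with a tiny sketch. The block marks are the ones quoted in the comment; the helper name `block_total` is hypothetical, and summing the marks is an assumed simplification of how block scores are combined.

```python
def block_total(block_marks):
    """Sum of the difficulty marks over consecutive blocks (assumed tally)."""
    return sum(block_marks)


if __name__ == "__main__":
    stretched = [7, 2]    # long block nudged up, short block left low
    regular_a = [6, 5]    # same play marked at a regular pace
    regular_b = [6, 6]

    # The stretched pair totals less than either regular pair.
    print(block_total(stretched), block_total(regular_a), block_total(regular_b))
```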

Variety – Checklist
I love that there is more accountability in Variety. It was always an area where reputation could get a team more points than it actually earned. That said, the checklist needs more thought. I believe Variety should be a learning category, a map for players to expand their capabilities. And I believe Variety should be designed so most top teams score the maximum using their preferred style of play. The intention should be “did this team show a professional level of variety?” Most top teams should easily max out.

The new checklist is too much: judges cannot track a list of skills while judging all the other subcategories. In practice, the best we could do was take a few notes and share observations after the routine. More troublingly, the checklist reads like a consultant outside freestyle wrote it. Names of throws and catches are wrong. There are instructions to not pay attention to the number of things checked off, but the checklist structure – and the clear definition of points in each Variety subcategory – implies that quantity matters without communicating how much quantity is enough. Confusing.

Variety – Pieces of Flair/Styles of Play
I don’t know why the subcategory of Variety called Styles of Play exists. It is a list of disc skills that somehow gets special treatment. Going back to intention, the message seems to be “here, do unusual and strange things. there are bonus points to be earned.” The reality is that each item in Styles of Play is a disc skill just like each item in Disc Handling. In my opinion, none of them should be considered any more or less important than the other. A team should not have to use pieces of flair (be crazy and wacky) to qualify for full variety credit. Get rid of Styles of Play.

General Impression
I’m not sad to see this go. Too many points distributed with pure subjectivity. Too much opportunity for accomplished players to be outscored by charisma or theatrics. The benefit of being able to express appreciation for the whole performance is not worth the cost of points given without accountability.

Defining What We’re Judging – Form and other concepts
There was apparently confusion about consecutivity at the players meeting in Medellin. There are a lot of debatable concepts in freestyle. When creating a judging system, we need to take a stand for what we think is important and define it clearly.

It goes back to intention – again. If we are measuring Form, we need to ask why. By measuring form, what do we want to encourage? If it’s ballet form, say it. If it’s maintaining balance and not falling down, say it. If it’s intentional body position, say it. Players should understand why we want to encourage whatever they are scoring, and they should have clear guidelines on how to score it.

Speed
This is more tournament administration, but it’s very related to everyone’s experience at an event. I’m less concerned with the speed of judges calculating their scores than with the dependence on technology after a round. I didn’t see judging slowing down all that much in Medellin. What I did see was an intention for every number to be checked and rechecked by computer before any results were released. While that intention of accuracy is awesome, the result was that players did not have a chance to review scores to express any concerns, they did not learn results on the same day of competition, and in the case of the finals the results were not announced until well after dark. Part of this was due to insufficient staffing, and part of it is a symptom of an overreliance on technology.

Accuracy serves players, and so does speed. There needs to be a balance. I have head judged rounds where we have had provisional results calculated five minutes after the last team performed. We collected scores after each team and tabulated the scores during the next performance. It works, and it still allows for computer checking of scores later on. As a player, I want faster turnaround.
Philipp Lenarz says:

August 19, 2014 at 4:43 am

Hi,

Actually I planned to leave this discussion to others, but after all the experiences we’ve had I feel I need to point out a few things. First, let me thank everybody for your constructive feedback. It was clear to the committee that the new system wouldn’t be perfect. Some changes work, others don’t. The proof of the pudding is in the eating, and that’s happening now. However, the way it’s happening now, I think the discussion will lead nowhere…

Some of the feedback given so far refers to the ‘technical’ implementation of changes and not the intention of the changes itself. Examples would be the pluses and minuses for consecutivity and handling the Variety checklist. These technical issues can be fixed I think. Other changes, however, create real dispute. These changes typically refer to the fundamental vision that each individual has on how the sport should be judged. Ask yourself a few questions:
– What is more valuable, “show” routines or “high diff” routines?
– Do you prefer complex and detailed judging that judges can be made accountable for, or do you prefer intuitive judging with immediate results?
– Do you want to make the sport more attractive to laymen by playing short routines, or do you want to give players more time to present what they’ve worked on?

During the committee’s work, we’ve tried to get the best decisions on these and many other points. This turned out to become a long “battle” between groups of people with contradicting visions. In the end, democratic decisions are about compromises, so we ended up leaving the radical changes aside and fine-tuned what we had.

For me – as I realized during our work – a predominant vision of where to go is lacking among players. Where does the community want to go? I think this question is a lot about: Do we want to please the audience (quick results, short show routines) or do we want to please the players (detailed/objective judging, high diff, longer routines)? You cannot have both!

Summing up: Everybody weighing in with their opinions – like it’s happening now – is really interesting but will not lead to real changes. The opinions are just too contradictory. So for me the first step should be: Do we want real changes, and who is ready to put effort into this? If so, we need to describe a few visions, vote on them and then follow that path resolutely, accepting also the downsides of this path. If we can’t get a clear answer on which path to take, we could run some tournaments one way and some the other and see what becomes most popular in the end.

Just my thoughts…

Cheers
Philipp

A side note on the feasibility of the “pleasing the audience” approach: First, for quick computation and presentation of results you need proper (electronic) equipment. Ideally each judge needs to enter their results into a computer that is connected to a server that automatically computes the overall score. Moreover, you need a screen to display the scores. How can we guarantee this at events? Second, quick intuitive judging requires competent judges. You need judges who are well-trained and experienced in all the facets of the sport so they can decide quickly. Moreover, they need to understand and accept their responsibility in such an intuitive system that is based more on subjectivity. Consequently, judging education is key. That point is not valid for quick judging only, but I think we are not working sufficiently in this direction. It’s a grassroots problem; nobody likes to deal with judging, everybody prefers to jam. Besides better judging education, having more judges and crossing out the high and low scores per category (as many other sports do) would also decrease subjectivity. But again, you need to have a lot of competent and motivated judges for this.
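
The idea of crossing out the high and low scores per category can be sketched as follows. The five-judge panel, the scores and the function name `trimmed_average` are hypothetical; this is an illustration of the general technique, not the FPA’s tabulation method.

```python
def trimmed_average(scores):
    """Average the scores after dropping one highest and one lowest mark."""
    if len(scores) < 3:
        # With fewer than three judges nothing would be left to average.
        raise ValueError("need at least three scores to drop high and low")
    ordered = sorted(scores)
    kept = ordered[1:-1]  # discard the single lowest and single highest
    return sum(kept) / len(kept)


if __name__ == "__main__":
    # Hypothetical marks from five judges for one category.
    panel = [6.0, 7.0, 7.5, 8.0, 9.5]
    print(trimmed_average(panel))
```

With only three judges per category, this would leave a single score, which is one reason the approach needs a larger pool of judges.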
Alex Leist says:

August 25, 2014 at 7:44 am

Hi, Alex here.
Just a short comment on Philipp’s remarks:
I think you are right when you say that we should have a discussion about where we want to go and what kind of play we want to promote if we talk about changing the judging system. Anyway, when I talk about the judging system, or when I tried it, I usually just thought about how I feel as a judge. Mostly this means how hard it was for me to judge, how difficult it was to give a mark that I can support. That’s the foundation of my view of the judging system changes and of my following comments.
Thanks again to the judging system committee for their work!

Exe:

I think the changes are good even though they don’t change a lot in the scores the players get. Advanced players now have a little more room in their marks due to the interpretation of flow, but as specific marks should only differ by about 0.1 from old-system marks, the changes are in my opinion insignificant. One disadvantage is that exe gets a little more complicated for newcomers to understand.

Diff:

Consecutivity marks:

Not good. It is not possible to watch for the difficulty and give marks for consecutivity at the same time. Also, giving minuses was hard because with some beginner teams I had the feeling of having to give minuses all the time even though they already had very low diff marks. One idea would be to give the judges the option to give a consecutivity bonus if they really found a combo very consecutive. Anyway, the consecutivity marks have started a discussion about what consecutivity really is and how it should be judged, which is definitely a good thing.

Diff multiplier:

As James already said, it is nice to see diff scores that are as high as exe and AI scores. However, whether it will really affect the way of playing or have any influence on the results is difficult to say…

Hybrid marks:

I really appreciate the work of the judging system committee, but I don’t see any advantage or disadvantage in using the new diff marks (except that you can maybe hear them better). In my opinion the diff marks didn’t get any closer to natural phrase judging than they were before. Nobody made his mark precisely at the sound of the mark. As a judge I always made my mark about 1 to 5 seconds before or after the sound, so for me using the new marks doesn’t change anything at all. Furthermore, the problems we have in judging difficulty (for example, that combos don’t fit the 15-second frame and we have to make a mark in the middle of one) haven’t been touched. We are still forced to make a mark within a certain period (as before). Knowing that this is a separate discussion, I’ll just mention it here: I prefer natural phrase judging (and I don’t see any problems using it).

AI:

variety checklist:

Impossible to use properly. It just takes too much time. I also share the criticism voiced above that it is definitely not complete and so directs the judges’ attention to a few specific things. Anyway, the list was good as a reminder of what variety could mean.

Flow and Form:

For me these categories are difficult to judge from 0 to 10. I can say someone’s form was bad, normal or very good, but I can’t say what the difference between a 7 and an 8 would be, for example. Of course the same problem occurred in the old system, but there it didn’t have such an influence. Still, I feel that form is an important part of AI. I don’t know how to solve the problem. One proposal could be to give a kind of form bonus instead of a mark from 0 to 10.
Flow, in my opinion, is also highly subjective, so giving it more influence doesn’t make the system more objective. Anyway, I usually feel comfortable with my flow mark, so I can support it.

General impression:

I definitely miss this subcategory. Even if it suffers from subjectivity, it is important for absorbing everything that the formal judging system doesn’t cover but that the judges feel.
