The main problem with skill challenges in 4e is that the entire design is flawed:
Skill challenges, as described in the DMG, output the ratio of successes to failures (in effect, the fraction of rolls that succeed) and compare that ratio to an effective threshold. Unfortunately, this means that the result is being normalized to the range 0 (no successes) to 1 (all successes), so the spread around the most likely result goes as 1/sqrt(N), where N is the number of rolls. For largish N (and it doesn't take that large an N), you then have either virtually guaranteed success, virtually guaranteed failure, or almost exactly 50% success WITH extreme sensitivity to small modifiers (a +/-1 on the DCs might take you from 50% success to 10% or 90% success). That extreme sensitivity to small modifiers is the core of what kills 4e's skill challenges.
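To make the sensitivity concrete, here is a minimal sketch under a simplifying assumption: exactly N independent rolls, each succeeding with the same probability p, and the challenge is won if at least half of them succeed. The function names and the 0.5 threshold are made up for illustration and are not the DMG procedure. A +/-1 on a d20 DC moves p by 0.05.

```python
from math import comb, ceil

def p_at_least(n, k, p):
    """Exact chance of k or more successes in n independent rolls."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(max(k, 0), n + 1))

def ratio_challenge(n, p, threshold=0.5):
    """Chance that the fraction of successful rolls is at least `threshold`.

    Assumption for illustration: a fixed number of rolls n, all with the
    same per-roll success chance p.
    """
    need = ceil(threshold * n - 1e-9)  # successes needed to reach the ratio
    return p_at_least(n, need, p)

# p = 0.45 / 0.50 / 0.55 corresponds to a -1 / 0 / +1 shift on a d20.
for n in (4, 16, 64, 256):
    row = [ratio_challenge(n, p) for p in (0.45, 0.50, 0.55)]
    print(n, [round(x, 3) for x in row])
```

Reading down the columns, the gap between the p = 0.45 and p = 0.55 results widens as N grows, which is the behaviour described above.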
Now let us take a better system (Stalker0's Obsidian system might be an example of what follows; I'm too lazy to look it up again). In this system, we work on an additive rule: the system outputs the total number of successes. Further, the range we compare that total to varies as sqrt(N). In the Obsidian system, this would mean that you expect a partial success, and that the width of the partial success range (the total-success threshold minus the failure threshold, counted in successes) equals constant*sqrt(N). (Stalker0, if this isn't the case, it should be.) What does this get us? Mainly the fact that the probability gradient is independent of N: our sensitivity to small modifiers does not depend on N. Under such a system, IF you correctly center everything (most individual rolls succeed near 50%, the expected success total gives you the desired outcome, modifiers on individual rolls are +/- 4ish or less on a d20), the final success/failure probabilities will behave reasonably, NOT depending wildly on N.
In short, NO system based on success/failure ratios will either do well over a large range of N, or behave well for even moderate N. A good system using many rolls needs a margin of error on the total number of successes that scales as sqrt(N), however it is implemented.
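The sqrt(N) scaling can be written out as a short worked derivation. This is a sketch that assumes independent rolls with a common success probability p; the symbols X, R, r, δ and z are notation introduced here, not the original poster's.

```latex
% X = number of successes in N rolls, R = X/N = fraction of successes
X \sim \mathrm{Binomial}(N, p), \qquad
\mathbb{E}[X] = Np, \qquad
\sigma_X = \sqrt{N p (1-p)}

\mathbb{E}[R] = p, \qquad
\sigma_R = \frac{\sigma_X}{N} = \sqrt{\frac{p(1-p)}{N}} \;\propto\; \frac{1}{\sqrt{N}}

% Fixed ratio threshold r: the distance from the threshold, measured in
% standard deviations, grows like sqrt(N).
z_{\text{ratio}} = \frac{p - r}{\sigma_R} = \frac{(p - r)\,\sqrt{N}}{\sqrt{p(1-p)}}

% Count threshold T = rN + \delta\sqrt{N} with rolls centered at p = r:
% the distance in standard deviations no longer depends on N.
z_{\text{count}} = \frac{Np - T}{\sigma_X} = \frac{-\delta}{\sqrt{p(1-p)}}
```

So a margin expressed as a fixed ratio gets sharper as N grows, while a margin of δ·sqrt(N) successes keeps the same bite at every N.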
So, finally, we want a system where:
(1) target number of successes T scales linearly with N, with the skill challenge difficulty delta T (as opposed to individual check difficulty) being a modifier on T that scales with sqrt(N).
(2) success/failure/partial success windows that (when measured in successes) scale with sqrt(N).
(3) individual check success probability averaging near 50%, with modifiers around that number kept small (which keeps us in the linear response regime).
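The three requirements above can be sketched roughly as follows. Everything specific here is an assumption for illustration: the function name `outcome_probs`, the target T = N/2 + delta*sqrt(N), and a partial-success window of +/- half_width*sqrt(N) successes around T. It is not the Obsidian system, and it only checks the centered case (p = 0.5 per roll); it does not test how per-roll modifiers interact with N.

```python
from math import comb, sqrt

def pmf(n, k, p):
    """Exact chance of exactly k successes in n independent rolls."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def outcome_probs(n, p=0.5, delta=0.0, half_width=0.5):
    """Return (failure, partial success, total success) probabilities.

    Hypothetical rule: target T = n/2 + delta*sqrt(n); totals below
    T - half_width*sqrt(n) fail, totals above T + half_width*sqrt(n)
    fully succeed, and anything in between is a partial success.
    """
    target = 0.5 * n + delta * sqrt(n)
    lo = target - half_width * sqrt(n)
    hi = target + half_width * sqrt(n)
    fail = sum(pmf(n, k, p) for k in range(n + 1) if k < lo)
    win = sum(pmf(n, k, p) for k in range(n + 1) if k > hi)
    return fail, 1.0 - fail - win, win

for n in (16, 64, 256):
    print(n, [round(x, 3) for x in outcome_probs(n)])
```

With the windows measured in units of sqrt(N), the three outcome probabilities stay in roughly the same place as N goes from 16 to 256, instead of collapsing onto a single outcome.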
This post, which seems to have gone unnoticed, is actually one of the most sophisticated and interesting things I've ever read on EnWorld.
Thanks for posting.