have you ever wondered how masscan chooses what order to scan targets? just doing it sequentially is obviously not ideal, but making a list of every target and then shuffling it would waste tons of memory.
it'd need some kind of function with a domain of how many things to shuffle, where each output has one corresponding input, and has random-looking behavior...

power of 2 trickery ¶
that sounds like a block cipher! let's make a little feistel network!

first we need a round function, this is what mixes stuff around to make the output seem random. the cool part about feistel networks is that this can be literally any pure function, and it'll still be invertible.
rounds = [0xe6, 0x21, 170, 59, 239]
def round_function(right, round):
return (right*round)^6
the feistel network itself is pretty simple, it just chops up the input, does a few iterations of flip flopping stuff around while xoring in the output of the round function, and then glues it back together.
def feistel(inp):
left = inp>>8
right = inp&255
for round in rounds:
left ^= round_function(right, round) & 255
left, right = right, left
return (left<<8)+right
this is has a 16 bit block size (as in has a domain of 0-65535), but it is pretty trivial to adjust the domain to any power of 2, although an odd number of bits needs a slightly more complicated "unbalanced" version.
except wait, masscan accepts arbitrary ranges of things to scan, that might not be a power of 2...
blackrock ¶
if you actually go looking in masscan's code for how it solves this, you'll find something called "blackrock". it's named after John Black and Phillip Rogaway, who created the technique of "cycle walking" to squish normal block ciphers into any arbitrary domain.

so how does cycle walking work? by feeding the output of a block cipher into itself until it fits!
def scramble(inp, size):
assert size <= 65536 and inp < size
r = feistel(inp)
while r >= size:
r = feistel(r)
return r
yup, that's it, that's the whole thing.
it doesn't even need to be a block cipher, any invertible function with a finite domain can be squished like this too.
what!? ¶
the reason this works is really clever. feeding the output back in like this must form a loop that ends up within the domain you want. having only one input that can get a specific output means it cannot repeat numbers until it reaches the initial input, and it must reach the initial input as its domain being finite means it'd run out of other numbers to choose. since the initial input would already be in the domain you want, it is guaranteed to find an output that fits.
oh also it can go backwards ¶
a nice side-effect of needing an invertible function is that you can invert it.
inverting a feistel network is as simple as running the rounds in reverse and xoring the round function'd left into the right instead of the right into the left.
def unfeistel(inp):
left = inp>>8
right = inp&255
for round in reversed(rounds):
right ^= round_function(left, round) & 255
right, left = left, right
return (left<<8)+right
unscramble is the exact same as scramble, except for using the inverse feistel network.
def unscramble(inp, size):
assert size <= 65536 and inp < size
r = unfeistel(inp)
while r >= size:
r = unfeistel(r)
return r
let's test it out ¶
sticking the output into a list like this does defeat the point, but whatever
if __name__ == "__main__":
shuffled = [scramble(h, 200) for h in range(200)]
print("woah some questionably-shuffled numbers", shuffled)
gotta make sure it is actually shuffled
ordered = list(range(200))
assert shuffled != ordered
test out undoing the shuffling for good measure
unshuffled = [unscramble(shuffled[h], 200) for h in range(200)]
assert unshuffled == ordered