In the room "Randomness": Mar 1 00:25 2001 from Jesse Mundis 2^3021377 - 1 Mar 1 00:26 2001 from Dave Is not so random... ;-) Mar 1 00:39 2001 from Jesse Mundis It could be.... Mar 1 00:48 2001 from gecko girl ...but it's not. Mar 1 01:27 2001 from Jesse Mundis Can a single number truly be said to be random? I'd think "randomness" is a quality of a sequence of numbers. something which produces "random numbers" produces a *sequence* with certain qualities of non-predictability. I *think* that individually, all numbers are equally random (or non-random). Josh, opinions? Mar 1 01:30 2001 from Josh Oh, I think that if we look at the spectra for this distribution, we'll find a rather substantial spike at 2^3021377 - 1. Mar 1 01:43 2001 from Jesse Mundis Well sure, but note you said "distribution." That implies some sort of set of data. I'm merely claiming that it's pretty meaningless to attribute randomness (or lack there of) to a particular number without the context of a given data set. If we say the data set is "numbers which appear on Usenet and web pages" then yes, I agree *within that set* my number is pretty un random with a big spike in the distribution. But within the platonic ideal set of all positive integers, it seems about as random as any other number. What I'm speculating on is that randomness isn't a quality of a number, but a quality of a set of numbers. Numbers, individually, without a surrounding set or context, really aren't random or non-random. Mar 1 02:00 2001 from Josh If you're trying to draw me into a serious discussion on the randomness of numbers, in general, I have a few comments. 1) The only general purpose way of verifying the actual randomness of a number is to know by what process it was selected. Your number (for instance) was likely pulled out of your ass. Unless there is something I don't know about your ass, it likely wasn't a particularly random number. 2) You can't even begin to speculate about the randomness (or lack there-of) of a number without knowing the possible range of outputs for the generating process. 3) A number can be 'random' because it came from a particular process, so it isn't just the process which is 'random'. Random is an attribute of numbers, not just processes. In fact, it can even be an attribute of a _particular_ number. 4) a single sample is a set, it is a set of size 1 All this being said, you can talk about the expected attributes of a random number: for a good RNG, the each bit is independent of each other bit and each bit has a 50% chance of being a 1. For instance, your example (2^3021377 - 1) is a number that requires 3021377 bits to express (at minimum) to express. If we just take those 3021377 bits, we can do an analysis of them. (And with a flick of my wrist, we now have an interesting set of size 3021377) 3021377 bits is more than enough for a reasonable sample, actually. We would find that the sample contained all 1s. This bit sample would fail any test for statistical randomness that I can think of. So, If I were provided only one sample from a RNG that produced 3021377 bit numbers, and this just happened to be the number, I'd say that I'd reasonably be able to say that the RNG was poor. Mar 1 04:11 2001 from bill Josh: I was with you up to your last statement. It feels too much like you would be saying that after flipping a coin 3021376 times and having it come up heads that probability would say that the next time it would *really* have to be tails. Rather than being a 50% chance. I can't believe you would even think that. :-) Mar 1 05:05 2001 from bill Bill: But you have to admit that it isn' t very likely. And it be hard to believe that it was randomly chosen. Although it wouldn't be a good generator if it was within the output range and it *never* came out. Mar 1 07:29 2001 from Josh Bill: errr... ok. No. I'm not saying "boy, those 3021376 bits all being one is odd, but it's the 3021377th bit that tore it". I'm saying that the chances of getting 3021377 '1' bits in a row from a good RNG is 1 in 2^3021377, which is a damn small number, therefor I suspect that it is _not_ a good random number generator. Mar 1 16:41 2001 from bill Josh: I gurss what I am trying to say is that if it were a *perfect* RNG, that number would be *exactly* as likely to occur as one that *appears* to have more entrophy. For example the random number generator isn't based in binary. It isn't a nice clean number any more. I think I should spend more time working with strange theoretical problems like this and infinity. :-) Mar 1 17:53 2001 from Josh You'll note that I did a bitwise analysis of the sample. Though any _particular_ number in my RNG has a 1/2^3021377 chance in being selected, the bits that make up that number have different characteristics. Like you can still expect the sample to generally display certain statistical properties. Because the sample does not does not display these statistical properties, you can draw one of two conclusions: 1) Wow, coincidences _do_ happen, I guess I should choose again or 2) OK, coincidence is one thing, but this particular occurrence is has about the same probability as all the molecules of air just happening to be in the other side of the room, causing spontaneous human decompression (ie: it could happen, but in practice it doesn't). :-) It doesn't help that 'all ones' is a common RNG failure. Mar 3 01:53 2001 from Jesse Mundis Josh, thanks. That's exactly the kind of insight I was looking for. In particular, your points 2 and 3 were of relevance. 2) You can't even begin to speculate about the randomness (or lack there-of) of a number without knowing the possible range of outputs for the generating process. 3) A number can be 'random' because it came from a particular process, so it isn't just the process which is 'random'. Random is an attribute of numbers, not just processes. In fact, it can even be an attribute of a _particular_ number. Ok, point 2 which you made above, was essentially my position. Without knowing the range or context in which the number exists, I claimed you couldn't meaningfully call *that number* random (or non-random). Only with context, or as you put it, the range of outputs for the generating process, can you speculate on randomness as a quality. I misinterpreted that as (and hence asked the original question) "without context, numbers lake the quality of (non ) randomness all together. Your point 3, however, along with you following statistical bit-test example makes a reasonable case for numbers *themselves* to be random. It's a subtle point. Thanks for the clarification. Mar 3 05:02 2001 from Josh It is the difference between possibility and likelihood. Sure, there is a possibility that the RNG just happens to spit out 2^3021377 - 1. In fact, assuming that the RNG generates 3021377 bit numbers, the possibility is 1/2^3021377. Also true, is that any other _particular_ number has the same likelihood to be picked as 2^3021377 - 1 (assuming it is a good RNG). But if you just change your view, you can comment on the number more rigorously. You can break up the number into lots of uniformly sized sections and analyze them. Because of the nature of randomness, you can expect any arbitrarily sized sub-section of the random number to display the same properties as the whole. (this is trivially provable, and is left as an exercise for the reader. LOL. If anyone is interested in why this is, I'll address it separately) I did a quick wave of my hands at the 1 bit sized chunks, but randomness testing is by no means limited to this. At my work, we do a series of statistical tests to attempt to tell if the PRNG in use is terribly, terribly flawed. (often it is... ). We test for a series of bit-based properties, and a nibble-based property. This is a not a particularly thorough test, but it catches lots of people. If I were interested in going beyond this, I'd do similar tests with larger chunks (at least 8 bit, and maybe 16 bit chunks). If I _really_ cared, I'd also do an additional series of tests (which, once again, if you're interested, I can go into...) OK, so what I've been trying to get to is this: just because each number is equally likely to occur, doesn't mean that every attribute of the number is also equally as likely. For instance, you can say: attribute: number of 1s in a 4 bit number. If you examine the probability distribution for this attribute, you'll find that there is exactly 1 case in the 2^4 sized space that has the attribute of all '1's, whereas there are 6 cases that have 2 '1's. So, obviously for a 4 bit number, you are more likely to get a number with 2 '1's than you are to get a number with 4 '1's. But not by a _huge_ amount: P(2 '1's) = 6/16 = .375, whereas P(0 '1's) = 1/16 = .0625. Note for further calculations: The generic way of computing the count of numbers that contain a certain count of '1's is this: for n bit numbers, the count of numbers containing m '1' bits is Combo(n, m) (that is n choose m). Now lets look at the 3021377 bit number. There are Combo(3021377, 1510688) (about 10^909521 or 2^3021363) numbers where the count of '1's is 1510688 (and the same count of numbers where the count of '1's is 1510689). There is 1 number with 3021377 '1's. So, P(1510688 '1's) = 1/2^14 = 6.103E-5, whereas P(3021377, 3021377) = 1 / 2^3021377 = 1E-909525. So, the all '1's set is 10^909519 time less likely than the roughly half 1s set. That's a big number. So, though you can say that every output number is equally likely, you can also say that you can expect the output number to be comprised of roughly half '1' bits. It sounds strange, but it's true.