1:mod:`random` --- Generate pseudo-random numbers 2================================================ 3 4.. module:: random 5 :synopsis: Generate pseudo-random numbers with various common distributions. 6 7**Source code:** :source:`Lib/random.py` 8 9-------------- 10 11This module implements pseudo-random number generators for various 12distributions. 13 14For integers, there is uniform selection from a range. For sequences, there is 15uniform selection of a random element, a function to generate a random 16permutation of a list in-place, and a function for random sampling without 17replacement. 18 19On the real line, there are functions to compute uniform, normal (Gaussian), 20lognormal, negative exponential, gamma, and beta distributions. For generating 21distributions of angles, the von Mises distribution is available. 22 23Almost all module functions depend on the basic function :func:`.random`, which 24generates a random float uniformly in the semi-open range [0.0, 1.0). Python 25uses the Mersenne Twister as the core generator. It produces 53-bit precision 26floats and has a period of 2\*\*19937-1. The underlying implementation in C is 27both fast and threadsafe. The Mersenne Twister is one of the most extensively 28tested random number generators in existence. However, being completely 29deterministic, it is not suitable for all purposes, and is completely unsuitable 30for cryptographic purposes. 31 32The functions supplied by this module are actually bound methods of a hidden 33instance of the :class:`random.Random` class. You can instantiate your own 34instances of :class:`Random` to get generators that don't share state. 35 36Class :class:`Random` can also be subclassed if you want to use a different 37basic generator of your own devising: in that case, override the :meth:`~Random.random`, 38:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. 39Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this 40allows :meth:`randrange` to produce selections over an arbitrarily large range. 41 42The :mod:`random` module also provides the :class:`SystemRandom` class which 43uses the system function :func:`os.urandom` to generate random numbers 44from sources provided by the operating system. 45 46.. warning:: 47 48 The pseudo-random generators of this module should not be used for 49 security purposes. For security or cryptographic uses, see the 50 :mod:`secrets` module. 51 52.. seealso:: 53 54 M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally 55 equidistributed uniform pseudorandom number generator", ACM Transactions on 56 Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. 57 58 59 `Complementary-Multiply-with-Carry recipe 60 <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative 61 random number generator with a long period and comparatively simple update 62 operations. 63 64 65Bookkeeping functions 66--------------------- 67 68.. function:: seed(a=None, version=2) 69 70 Initialize the random number generator. 71 72 If *a* is omitted or ``None``, the current system time is used. If 73 randomness sources are provided by the operating system, they are used 74 instead of the system time (see the :func:`os.urandom` function for details 75 on availability). 76 77 If *a* is an int, it is used directly. 78 79 With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` 80 object gets converted to an :class:`int` and all of its bits are used. 81 82 With version 1 (provided for reproducing random sequences from older versions 83 of Python), the algorithm for :class:`str` and :class:`bytes` generates a 84 narrower range of seeds. 85 86 .. versionchanged:: 3.2 87 Moved to the version 2 scheme which uses all of the bits in a string seed. 88 89 .. deprecated:: 3.9 90 In the future, the *seed* must be one of the following types: 91 *NoneType*, :class:`int`, :class:`float`, :class:`str`, 92 :class:`bytes`, or :class:`bytearray`. 93 94.. function:: getstate() 95 96 Return an object capturing the current internal state of the generator. This 97 object can be passed to :func:`setstate` to restore the state. 98 99 100.. function:: setstate(state) 101 102 *state* should have been obtained from a previous call to :func:`getstate`, and 103 :func:`setstate` restores the internal state of the generator to what it was at 104 the time :func:`getstate` was called. 105 106 107Functions for bytes 108------------------- 109 110.. function:: randbytes(n) 111 112 Generate *n* random bytes. 113 114 This method should not be used for generating security tokens. 115 Use :func:`secrets.token_bytes` instead. 116 117 .. versionadded:: 3.9 118 119 120Functions for integers 121---------------------- 122 123.. function:: randrange(stop) 124 randrange(start, stop[, step]) 125 126 Return a randomly selected element from ``range(start, stop, step)``. This is 127 equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a 128 range object. 129 130 The positional argument pattern matches that of :func:`range`. Keyword arguments 131 should not be used because the function may use them in unexpected ways. 132 133 .. versionchanged:: 3.2 134 :meth:`randrange` is more sophisticated about producing equally distributed 135 values. Formerly it used a style like ``int(random()*n)`` which could produce 136 slightly uneven distributions. 137 138.. function:: randint(a, b) 139 140 Return a random integer *N* such that ``a <= N <= b``. Alias for 141 ``randrange(a, b+1)``. 142 143.. function:: getrandbits(k) 144 145 Returns a Python integer with *k* random bits. This method is supplied with 146 the MersenneTwister generator and some other generators may also provide it 147 as an optional part of the API. When available, :meth:`getrandbits` enables 148 :meth:`randrange` to handle arbitrarily large ranges. 149 150 .. versionchanged:: 3.9 151 This method now accepts zero for *k*. 152 153 154Functions for sequences 155----------------------- 156 157.. function:: choice(seq) 158 159 Return a random element from the non-empty sequence *seq*. If *seq* is empty, 160 raises :exc:`IndexError`. 161 162.. function:: choices(population, weights=None, *, cum_weights=None, k=1) 163 164 Return a *k* sized list of elements chosen from the *population* with replacement. 165 If the *population* is empty, raises :exc:`IndexError`. 166 167 If a *weights* sequence is specified, selections are made according to the 168 relative weights. Alternatively, if a *cum_weights* sequence is given, the 169 selections are made according to the cumulative weights (perhaps computed 170 using :func:`itertools.accumulate`). For example, the relative weights 171 ``[10, 5, 30, 5]`` are equivalent to the cumulative weights 172 ``[10, 15, 45, 50]``. Internally, the relative weights are converted to 173 cumulative weights before making selections, so supplying the cumulative 174 weights saves work. 175 176 If neither *weights* nor *cum_weights* are specified, selections are made 177 with equal probability. If a weights sequence is supplied, it must be 178 the same length as the *population* sequence. It is a :exc:`TypeError` 179 to specify both *weights* and *cum_weights*. 180 181 The *weights* or *cum_weights* can use any numeric type that interoperates 182 with the :class:`float` values returned by :func:`random` (that includes 183 integers, floats, and fractions but excludes decimals). Behavior is 184 undefined if any weight is negative. A :exc:`ValueError` is raised if all 185 weights are zero. 186 187 For a given seed, the :func:`choices` function with equal weighting 188 typically produces a different sequence than repeated calls to 189 :func:`choice`. The algorithm used by :func:`choices` uses floating 190 point arithmetic for internal consistency and speed. The algorithm used 191 by :func:`choice` defaults to integer arithmetic with repeated selections 192 to avoid small biases from round-off error. 193 194 .. versionadded:: 3.6 195 196 .. versionchanged:: 3.9 197 Raises a :exc:`ValueError` if all weights are zero. 198 199 200.. function:: shuffle(x[, random]) 201 202 Shuffle the sequence *x* in place. 203 204 The optional argument *random* is a 0-argument function returning a random 205 float in [0.0, 1.0); by default, this is the function :func:`.random`. 206 207 To shuffle an immutable sequence and return a new shuffled list, use 208 ``sample(x, k=len(x))`` instead. 209 210 Note that even for small ``len(x)``, the total number of permutations of *x* 211 can quickly grow larger than the period of most random number generators. 212 This implies that most permutations of a long sequence can never be 213 generated. For example, a sequence of length 2080 is the largest that 214 can fit within the period of the Mersenne Twister random number generator. 215 216 .. deprecated-removed:: 3.9 3.11 217 The optional parameter *random*. 218 219 220.. function:: sample(population, k, *, counts=None) 221 222 Return a *k* length list of unique elements chosen from the population sequence 223 or set. Used for random sampling without replacement. 224 225 Returns a new list containing elements from the population while leaving the 226 original population unchanged. The resulting list is in selection order so that 227 all sub-slices will also be valid random samples. This allows raffle winners 228 (the sample) to be partitioned into grand prize and second place winners (the 229 subslices). 230 231 Members of the population need not be :term:`hashable` or unique. If the population 232 contains repeats, then each occurrence is a possible selection in the sample. 233 234 Repeated elements can be specified one at a time or with the optional 235 keyword-only *counts* parameter. For example, ``sample(['red', 'blue'], 236 counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red', 237 'blue', 'blue'], k=5)``. 238 239 To choose a sample from a range of integers, use a :func:`range` object as an 240 argument. This is especially fast and space efficient for sampling from a large 241 population: ``sample(range(10000000), k=60)``. 242 243 If the sample size is larger than the population size, a :exc:`ValueError` 244 is raised. 245 246 .. versionchanged:: 3.9 247 Added the *counts* parameter. 248 249 .. deprecated:: 3.9 250 In the future, the *population* must be a sequence. Instances of 251 :class:`set` are no longer supported. The set must first be converted 252 to a :class:`list` or :class:`tuple`, preferably in a deterministic 253 order so that the sample is reproducible. 254 255 256.. _real-valued-distributions: 257 258Real-valued distributions 259------------------------- 260 261The following functions generate specific real-valued distributions. Function 262parameters are named after the corresponding variables in the distribution's 263equation, as used in common mathematical practice; most of these equations can 264be found in any statistics text. 265 266 267.. function:: random() 268 269 Return the next random floating point number in the range [0.0, 1.0). 270 271 272.. function:: uniform(a, b) 273 274 Return a random floating point number *N* such that ``a <= N <= b`` for 275 ``a <= b`` and ``b <= N <= a`` for ``b < a``. 276 277 The end-point value ``b`` may or may not be included in the range 278 depending on floating-point rounding in the equation ``a + (b-a) * random()``. 279 280 281.. function:: triangular(low, high, mode) 282 283 Return a random floating point number *N* such that ``low <= N <= high`` and 284 with the specified *mode* between those bounds. The *low* and *high* bounds 285 default to zero and one. The *mode* argument defaults to the midpoint 286 between the bounds, giving a symmetric distribution. 287 288 289.. function:: betavariate(alpha, beta) 290 291 Beta distribution. Conditions on the parameters are ``alpha > 0`` and 292 ``beta > 0``. Returned values range between 0 and 1. 293 294 295.. function:: expovariate(lambd) 296 297 Exponential distribution. *lambd* is 1.0 divided by the desired 298 mean. It should be nonzero. (The parameter would be called 299 "lambda", but that is a reserved word in Python.) Returned values 300 range from 0 to positive infinity if *lambd* is positive, and from 301 negative infinity to 0 if *lambd* is negative. 302 303 304.. function:: gammavariate(alpha, beta) 305 306 Gamma distribution. (*Not* the gamma function!) Conditions on the 307 parameters are ``alpha > 0`` and ``beta > 0``. 308 309 The probability distribution function is:: 310 311 x ** (alpha - 1) * math.exp(-x / beta) 312 pdf(x) = -------------------------------------- 313 math.gamma(alpha) * beta ** alpha 314 315 316.. function:: gauss(mu, sigma) 317 318 Gaussian distribution. *mu* is the mean, and *sigma* is the standard 319 deviation. This is slightly faster than the :func:`normalvariate` function 320 defined below. 321 322 Multithreading note: When two threads call this function 323 simultaneously, it is possible that they will receive the 324 same return value. This can be avoided in three ways. 325 1) Have each thread use a different instance of the random 326 number generator. 2) Put locks around all calls. 3) Use the 327 slower, but thread-safe :func:`normalvariate` function instead. 328 329 330.. function:: lognormvariate(mu, sigma) 331 332 Log normal distribution. If you take the natural logarithm of this 333 distribution, you'll get a normal distribution with mean *mu* and standard 334 deviation *sigma*. *mu* can have any value, and *sigma* must be greater than 335 zero. 336 337 338.. function:: normalvariate(mu, sigma) 339 340 Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. 341 342 343.. function:: vonmisesvariate(mu, kappa) 344 345 *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* 346 is the concentration parameter, which must be greater than or equal to zero. If 347 *kappa* is equal to zero, this distribution reduces to a uniform random angle 348 over the range 0 to 2\*\ *pi*. 349 350 351.. function:: paretovariate(alpha) 352 353 Pareto distribution. *alpha* is the shape parameter. 354 355 356.. function:: weibullvariate(alpha, beta) 357 358 Weibull distribution. *alpha* is the scale parameter and *beta* is the shape 359 parameter. 360 361 362Alternative Generator 363--------------------- 364 365.. class:: Random([seed]) 366 367 Class that implements the default pseudo-random number generator used by the 368 :mod:`random` module. 369 370 .. deprecated:: 3.9 371 In the future, the *seed* must be one of the following types: 372 :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`, 373 :class:`bytes`, or :class:`bytearray`. 374 375.. class:: SystemRandom([seed]) 376 377 Class that uses the :func:`os.urandom` function for generating random numbers 378 from sources provided by the operating system. Not available on all systems. 379 Does not rely on software state, and sequences are not reproducible. Accordingly, 380 the :meth:`seed` method has no effect and is ignored. 381 The :meth:`getstate` and :meth:`setstate` methods raise 382 :exc:`NotImplementedError` if called. 383 384 385Notes on Reproducibility 386------------------------ 387 388Sometimes it is useful to be able to reproduce the sequences given by a 389pseudo-random number generator. By re-using a seed value, the same sequence should be 390reproducible from run to run as long as multiple threads are not running. 391 392Most of the random module's algorithms and seeding functions are subject to 393change across Python versions, but two aspects are guaranteed not to change: 394 395* If a new seeding method is added, then a backward compatible seeder will be 396 offered. 397 398* The generator's :meth:`~Random.random` method will continue to produce the same 399 sequence when the compatible seeder is given the same seed. 400 401.. _random-examples: 402 403Examples 404-------- 405 406Basic examples:: 407 408 >>> random() # Random float: 0.0 <= x < 1.0 409 0.37444887175646646 410 411 >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 412 3.1800146073117523 413 414 >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds 415 5.148957571865031 416 417 >>> randrange(10) # Integer from 0 to 9 inclusive 418 7 419 420 >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive 421 26 422 423 >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence 424 'draw' 425 426 >>> deck = 'ace two three four'.split() 427 >>> shuffle(deck) # Shuffle a list 428 >>> deck 429 ['four', 'two', 'ace', 'three'] 430 431 >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement 432 [40, 10, 50, 30] 433 434Simulations:: 435 436 >>> # Six roulette wheel spins (weighted sampling with replacement) 437 >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) 438 ['red', 'green', 'black', 'black', 'red', 'black'] 439 440 >>> # Deal 20 cards without replacement from a deck 441 >>> # of 52 playing cards, and determine the proportion of cards 442 >>> # with a ten-value: ten, jack, queen, or king. 443 >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20) 444 >>> dealt.count('tens') / 20 445 0.15 446 447 >>> # Estimate the probability of getting 5 or more heads from 7 spins 448 >>> # of a biased coin that settles on heads 60% of the time. 449 >>> def trial(): 450 ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 451 ... 452 >>> sum(trial() for i in range(10_000)) / 10_000 453 0.4169 454 455 >>> # Probability of the median of 5 samples being in middle two quartiles 456 >>> def trial(): 457 ... return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 458 ... 459 >>> sum(trial() for i in range(10_000)) / 10_000 460 0.7958 461 462Example of `statistical bootstrapping 463<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling 464with replacement to estimate a confidence interval for the mean of a sample:: 465 466 # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 467 from statistics import fmean as mean 468 from random import choices 469 470 data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95] 471 means = sorted(mean(choices(data, k=len(data))) for i in range(100)) 472 print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' 473 f'interval from {means[5]:.1f} to {means[94]:.1f}') 474 475Example of a `resampling permutation test 476<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ 477to determine the statistical significance or `p-value 478<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference 479between the effects of a drug versus a placebo:: 480 481 # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson 482 from statistics import fmean as mean 483 from random import shuffle 484 485 drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] 486 placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] 487 observed_diff = mean(drug) - mean(placebo) 488 489 n = 10_000 490 count = 0 491 combined = drug + placebo 492 for i in range(n): 493 shuffle(combined) 494 new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) 495 count += (new_diff >= observed_diff) 496 497 print(f'{n} label reshufflings produced only {count} instances with a difference') 498 print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') 499 print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') 500 print(f'hypothesis that there is no difference between the drug and the placebo.') 501 502Simulation of arrival times and service deliveries for a multiserver queue:: 503 504 from heapq import heappush, heappop 505 from random import expovariate, gauss 506 from statistics import mean, median, stdev 507 508 average_arrival_interval = 5.6 509 average_service_time = 15.0 510 stdev_service_time = 3.5 511 num_servers = 3 512 513 waits = [] 514 arrival_time = 0.0 515 servers = [0.0] * num_servers # time when each server becomes available 516 for i in range(100_000): 517 arrival_time += expovariate(1.0 / average_arrival_interval) 518 next_server_available = heappop(servers) 519 wait = max(0.0, next_server_available - arrival_time) 520 waits.append(wait) 521 service_duration = gauss(average_service_time, stdev_service_time) 522 service_completed = arrival_time + wait + service_duration 523 heappush(servers, service_completed) 524 525 print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.') 526 print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.') 527 528.. seealso:: 529 530 `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ 531 a video tutorial by 532 `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ 533 on statistical analysis using just a few fundamental concepts 534 including simulation, sampling, shuffling, and cross-validation. 535 536 `Economics Simulation 537 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ 538 a simulation of a marketplace by 539 `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective 540 use of many of the tools and distributions provided by this module 541 (gauss, uniform, sample, betavariate, choice, triangular, and randrange). 542 543 `A Concrete Introduction to Probability (using Python) 544 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ 545 a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering 546 the basics of probability theory, how to write simulations, and 547 how to perform data analysis using Python. 548 549 550Recipes 551------- 552 553The default :func:`.random` returns multiples of 2⁻⁵³ in the range 554*0.0 ≤ x < 1.0*. All such numbers are evenly spaced and are exactly 555representable as Python floats. However, many other representable 556floats in that interval are not possible selections. For example, 557``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³. 558 559The following recipe takes a different approach. All floats in the 560interval are possible selections. The mantissa comes from a uniform 561distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*. The 562exponent comes from a geometric distribution where exponents smaller 563than *-53* occur half as often as the next larger exponent. 564 565:: 566 567 from random import Random 568 from math import ldexp 569 570 class FullRandom(Random): 571 572 def random(self): 573 mantissa = 0x10_0000_0000_0000 | self.getrandbits(52) 574 exponent = -53 575 x = 0 576 while not x: 577 x = self.getrandbits(32) 578 exponent += x.bit_length() - 32 579 return ldexp(mantissa, exponent) 580 581All :ref:`real valued distributions <real-valued-distributions>` 582in the class will use the new method:: 583 584 >>> fr = FullRandom() 585 >>> fr.random() 586 0.05954861408025609 587 >>> fr.expovariate(0.25) 588 8.87925541791544 589 590The recipe is conceptually equivalent to an algorithm that chooses from 591all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*. All such 592numbers are evenly spaced, but most have to be rounded down to the 593nearest representable Python float. (The value 2⁻¹⁰⁷⁴ is the smallest 594positive unnormalized float and is equal to ``math.ulp(0.0)``.) 595 596 597.. seealso:: 598 599 `Generating Pseudo-random Floating-Point Values 600 <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a 601 paper by Allen B. Downey describing ways to generate more 602 fine-grained floats than normally generated by :func:`.random`. 603