1:mod:`random` --- Generate pseudo-random numbers
2================================================
3
4.. module:: random
5   :synopsis: Generate pseudo-random numbers with various common distributions.
6
7**Source code:** :source:`Lib/random.py`
8
9--------------
10
11This module implements pseudo-random number generators for various
12distributions.
13
14For integers, there is uniform selection from a range. For sequences, there is
15uniform selection of a random element, a function to generate a random
16permutation of a list in-place, and a function for random sampling without
17replacement.
18
19On the real line, there are functions to compute uniform, normal (Gaussian),
20lognormal, negative exponential, gamma, and beta distributions. For generating
21distributions of angles, the von Mises distribution is available.
22
23Almost all module functions depend on the basic function :func:`.random`, which
24generates a random float uniformly in the semi-open range [0.0, 1.0).  Python
25uses the Mersenne Twister as the core generator.  It produces 53-bit precision
26floats and has a period of 2\*\*19937-1.  The underlying implementation in C is
27both fast and threadsafe.  The Mersenne Twister is one of the most extensively
28tested random number generators in existence.  However, being completely
29deterministic, it is not suitable for all purposes, and is completely unsuitable
30for cryptographic purposes.
31
32The functions supplied by this module are actually bound methods of a hidden
33instance of the :class:`random.Random` class.  You can instantiate your own
34instances of :class:`Random` to get generators that don't share state.
35
36Class :class:`Random` can also be subclassed if you want to use a different
37basic generator of your own devising: in that case, override the :meth:`~Random.random`,
38:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods.
39Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this
40allows :meth:`randrange` to produce selections over an arbitrarily large range.
41
42The :mod:`random` module also provides the :class:`SystemRandom` class which
43uses the system function :func:`os.urandom` to generate random numbers
44from sources provided by the operating system.
45
46.. warning::
47
48   The pseudo-random generators of this module should not be used for
49   security purposes.  For security or cryptographic uses, see the
50   :mod:`secrets` module.
51
52.. seealso::
53
54   M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally
55   equidistributed uniform pseudorandom number generator", ACM Transactions on
56   Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998.
57
58
59   `Complementary-Multiply-with-Carry recipe
60   <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative
61   random number generator with a long period and comparatively simple update
62   operations.
63
64
65Bookkeeping functions
66---------------------
67
68.. function:: seed(a=None, version=2)
69
70   Initialize the random number generator.
71
72   If *a* is omitted or ``None``, the current system time is used.  If
73   randomness sources are provided by the operating system, they are used
74   instead of the system time (see the :func:`os.urandom` function for details
75   on availability).
76
77   If *a* is an int, it is used directly.
78
79   With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray`
80   object gets converted to an :class:`int` and all of its bits are used.
81
82   With version 1 (provided for reproducing random sequences from older versions
83   of Python), the algorithm for :class:`str` and :class:`bytes` generates a
84   narrower range of seeds.
85
86   .. versionchanged:: 3.2
87      Moved to the version 2 scheme which uses all of the bits in a string seed.
88
89   .. deprecated:: 3.9
90      In the future, the *seed* must be one of the following types:
91      *NoneType*, :class:`int`, :class:`float`, :class:`str`,
92      :class:`bytes`, or :class:`bytearray`.
93
94.. function:: getstate()
95
96   Return an object capturing the current internal state of the generator.  This
97   object can be passed to :func:`setstate` to restore the state.
98
99
100.. function:: setstate(state)
101
102   *state* should have been obtained from a previous call to :func:`getstate`, and
103   :func:`setstate` restores the internal state of the generator to what it was at
104   the time :func:`getstate` was called.
105
106
107Functions for bytes
108-------------------
109
110.. function:: randbytes(n)
111
112   Generate *n* random bytes.
113
114   This method should not be used for generating security tokens.
115   Use :func:`secrets.token_bytes` instead.
116
117   .. versionadded:: 3.9
118
119
120Functions for integers
121----------------------
122
123.. function:: randrange(stop)
124              randrange(start, stop[, step])
125
126   Return a randomly selected element from ``range(start, stop, step)``.  This is
127   equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a
128   range object.
129
130   The positional argument pattern matches that of :func:`range`.  Keyword arguments
131   should not be used because the function may use them in unexpected ways.
132
133   .. versionchanged:: 3.2
134      :meth:`randrange` is more sophisticated about producing equally distributed
135      values.  Formerly it used a style like ``int(random()*n)`` which could produce
136      slightly uneven distributions.
137
138.. function:: randint(a, b)
139
140   Return a random integer *N* such that ``a <= N <= b``.  Alias for
141   ``randrange(a, b+1)``.
142
143.. function:: getrandbits(k)
144
145   Returns a Python integer with *k* random bits. This method is supplied with
146   the MersenneTwister generator and some other generators may also provide it
147   as an optional part of the API. When available, :meth:`getrandbits` enables
148   :meth:`randrange` to handle arbitrarily large ranges.
149
150   .. versionchanged:: 3.9
151      This method now accepts zero for *k*.
152
153
154Functions for sequences
155-----------------------
156
157.. function:: choice(seq)
158
159   Return a random element from the non-empty sequence *seq*. If *seq* is empty,
160   raises :exc:`IndexError`.
161
162.. function:: choices(population, weights=None, *, cum_weights=None, k=1)
163
164   Return a *k* sized list of elements chosen from the *population* with replacement.
165   If the *population* is empty, raises :exc:`IndexError`.
166
167   If a *weights* sequence is specified, selections are made according to the
168   relative weights.  Alternatively, if a *cum_weights* sequence is given, the
169   selections are made according to the cumulative weights (perhaps computed
170   using :func:`itertools.accumulate`).  For example, the relative weights
171   ``[10, 5, 30, 5]`` are equivalent to the cumulative weights
172   ``[10, 15, 45, 50]``.  Internally, the relative weights are converted to
173   cumulative weights before making selections, so supplying the cumulative
174   weights saves work.
175
176   If neither *weights* nor *cum_weights* are specified, selections are made
177   with equal probability.  If a weights sequence is supplied, it must be
178   the same length as the *population* sequence.  It is a :exc:`TypeError`
179   to specify both *weights* and *cum_weights*.
180
181   The *weights* or *cum_weights* can use any numeric type that interoperates
182   with the :class:`float` values returned by :func:`random` (that includes
183   integers, floats, and fractions but excludes decimals).  Behavior is
184   undefined if any weight is negative.  A :exc:`ValueError` is raised if all
185   weights are zero.
186
187   For a given seed, the :func:`choices` function with equal weighting
188   typically produces a different sequence than repeated calls to
189   :func:`choice`.  The algorithm used by :func:`choices` uses floating
190   point arithmetic for internal consistency and speed.  The algorithm used
191   by :func:`choice` defaults to integer arithmetic with repeated selections
192   to avoid small biases from round-off error.
193
194   .. versionadded:: 3.6
195
196   .. versionchanged:: 3.9
197      Raises a :exc:`ValueError` if all weights are zero.
198
199
200.. function:: shuffle(x[, random])
201
202   Shuffle the sequence *x* in place.
203
204   The optional argument *random* is a 0-argument function returning a random
205   float in [0.0, 1.0); by default, this is the function :func:`.random`.
206
207   To shuffle an immutable sequence and return a new shuffled list, use
208   ``sample(x, k=len(x))`` instead.
209
210   Note that even for small ``len(x)``, the total number of permutations of *x*
211   can quickly grow larger than the period of most random number generators.
212   This implies that most permutations of a long sequence can never be
213   generated.  For example, a sequence of length 2080 is the largest that
214   can fit within the period of the Mersenne Twister random number generator.
215
216   .. deprecated-removed:: 3.9 3.11
217      The optional parameter *random*.
218
219
220.. function:: sample(population, k, *, counts=None)
221
222   Return a *k* length list of unique elements chosen from the population sequence
223   or set. Used for random sampling without replacement.
224
225   Returns a new list containing elements from the population while leaving the
226   original population unchanged.  The resulting list is in selection order so that
227   all sub-slices will also be valid random samples.  This allows raffle winners
228   (the sample) to be partitioned into grand prize and second place winners (the
229   subslices).
230
231   Members of the population need not be :term:`hashable` or unique.  If the population
232   contains repeats, then each occurrence is a possible selection in the sample.
233
234   Repeated elements can be specified one at a time or with the optional
235   keyword-only *counts* parameter.  For example, ``sample(['red', 'blue'],
236   counts=[4, 2], k=5)`` is equivalent to ``sample(['red', 'red', 'red', 'red',
237   'blue', 'blue'], k=5)``.
238
239   To choose a sample from a range of integers, use a :func:`range` object as an
240   argument.  This is especially fast and space efficient for sampling from a large
241   population:  ``sample(range(10000000), k=60)``.
242
243   If the sample size is larger than the population size, a :exc:`ValueError`
244   is raised.
245
246   .. versionchanged:: 3.9
247      Added the *counts* parameter.
248
249   .. deprecated:: 3.9
250      In the future, the *population* must be a sequence.  Instances of
251      :class:`set` are no longer supported.  The set must first be converted
252      to a :class:`list` or :class:`tuple`, preferably in a deterministic
253      order so that the sample is reproducible.
254
255
256.. _real-valued-distributions:
257
258Real-valued distributions
259-------------------------
260
261The following functions generate specific real-valued distributions. Function
262parameters are named after the corresponding variables in the distribution's
263equation, as used in common mathematical practice; most of these equations can
264be found in any statistics text.
265
266
267.. function:: random()
268
269   Return the next random floating point number in the range [0.0, 1.0).
270
271
272.. function:: uniform(a, b)
273
274   Return a random floating point number *N* such that ``a <= N <= b`` for
275   ``a <= b`` and ``b <= N <= a`` for ``b < a``.
276
277   The end-point value ``b`` may or may not be included in the range
278   depending on floating-point rounding in the equation ``a + (b-a) * random()``.
279
280
281.. function:: triangular(low, high, mode)
282
283   Return a random floating point number *N* such that ``low <= N <= high`` and
284   with the specified *mode* between those bounds.  The *low* and *high* bounds
285   default to zero and one.  The *mode* argument defaults to the midpoint
286   between the bounds, giving a symmetric distribution.
287
288
289.. function:: betavariate(alpha, beta)
290
291   Beta distribution.  Conditions on the parameters are ``alpha > 0`` and
292   ``beta > 0``. Returned values range between 0 and 1.
293
294
295.. function:: expovariate(lambd)
296
297   Exponential distribution.  *lambd* is 1.0 divided by the desired
298   mean.  It should be nonzero.  (The parameter would be called
299   "lambda", but that is a reserved word in Python.)  Returned values
300   range from 0 to positive infinity if *lambd* is positive, and from
301   negative infinity to 0 if *lambd* is negative.
302
303
304.. function:: gammavariate(alpha, beta)
305
306   Gamma distribution.  (*Not* the gamma function!)  Conditions on the
307   parameters are ``alpha > 0`` and ``beta > 0``.
308
309   The probability distribution function is::
310
311                 x ** (alpha - 1) * math.exp(-x / beta)
312       pdf(x) =  --------------------------------------
313                   math.gamma(alpha) * beta ** alpha
314
315
316.. function:: gauss(mu, sigma)
317
318   Gaussian distribution.  *mu* is the mean, and *sigma* is the standard
319   deviation.  This is slightly faster than the :func:`normalvariate` function
320   defined below.
321
322   Multithreading note:  When two threads call this function
323   simultaneously, it is possible that they will receive the
324   same return value.  This can be avoided in three ways.
325   1) Have each thread use a different instance of the random
326   number generator. 2) Put locks around all calls. 3) Use the
327   slower, but thread-safe :func:`normalvariate` function instead.
328
329
330.. function:: lognormvariate(mu, sigma)
331
332   Log normal distribution.  If you take the natural logarithm of this
333   distribution, you'll get a normal distribution with mean *mu* and standard
334   deviation *sigma*.  *mu* can have any value, and *sigma* must be greater than
335   zero.
336
337
338.. function:: normalvariate(mu, sigma)
339
340   Normal distribution.  *mu* is the mean, and *sigma* is the standard deviation.
341
342
343.. function:: vonmisesvariate(mu, kappa)
344
345   *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa*
346   is the concentration parameter, which must be greater than or equal to zero.  If
347   *kappa* is equal to zero, this distribution reduces to a uniform random angle
348   over the range 0 to 2\*\ *pi*.
349
350
351.. function:: paretovariate(alpha)
352
353   Pareto distribution.  *alpha* is the shape parameter.
354
355
356.. function:: weibullvariate(alpha, beta)
357
358   Weibull distribution.  *alpha* is the scale parameter and *beta* is the shape
359   parameter.
360
361
362Alternative Generator
363---------------------
364
365.. class:: Random([seed])
366
367   Class that implements the default pseudo-random number generator used by the
368   :mod:`random` module.
369
370   .. deprecated:: 3.9
371      In the future, the *seed* must be one of the following types:
372      :class:`NoneType`, :class:`int`, :class:`float`, :class:`str`,
373      :class:`bytes`, or :class:`bytearray`.
374
375.. class:: SystemRandom([seed])
376
377   Class that uses the :func:`os.urandom` function for generating random numbers
378   from sources provided by the operating system. Not available on all systems.
379   Does not rely on software state, and sequences are not reproducible. Accordingly,
380   the :meth:`seed` method has no effect and is ignored.
381   The :meth:`getstate` and :meth:`setstate` methods raise
382   :exc:`NotImplementedError` if called.
383
384
385Notes on Reproducibility
386------------------------
387
388Sometimes it is useful to be able to reproduce the sequences given by a
389pseudo-random number generator.  By re-using a seed value, the same sequence should be
390reproducible from run to run as long as multiple threads are not running.
391
392Most of the random module's algorithms and seeding functions are subject to
393change across Python versions, but two aspects are guaranteed not to change:
394
395* If a new seeding method is added, then a backward compatible seeder will be
396  offered.
397
398* The generator's :meth:`~Random.random` method will continue to produce the same
399  sequence when the compatible seeder is given the same seed.
400
401.. _random-examples:
402
403Examples
404--------
405
406Basic examples::
407
408   >>> random()                             # Random float:  0.0 <= x < 1.0
409   0.37444887175646646
410
411   >>> uniform(2.5, 10.0)                   # Random float:  2.5 <= x < 10.0
412   3.1800146073117523
413
414   >>> expovariate(1 / 5)                   # Interval between arrivals averaging 5 seconds
415   5.148957571865031
416
417   >>> randrange(10)                        # Integer from 0 to 9 inclusive
418   7
419
420   >>> randrange(0, 101, 2)                 # Even integer from 0 to 100 inclusive
421   26
422
423   >>> choice(['win', 'lose', 'draw'])      # Single random element from a sequence
424   'draw'
425
426   >>> deck = 'ace two three four'.split()
427   >>> shuffle(deck)                        # Shuffle a list
428   >>> deck
429   ['four', 'two', 'ace', 'three']
430
431   >>> sample([10, 20, 30, 40, 50], k=4)    # Four samples without replacement
432   [40, 10, 50, 30]
433
434Simulations::
435
436   >>> # Six roulette wheel spins (weighted sampling with replacement)
437   >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6)
438   ['red', 'green', 'black', 'black', 'red', 'black']
439
440   >>> # Deal 20 cards without replacement from a deck
441   >>> # of 52 playing cards, and determine the proportion of cards
442   >>> # with a ten-value:  ten, jack, queen, or king.
443   >>> dealt = sample(['tens', 'low cards'], counts=[16, 36], k=20)
444   >>> dealt.count('tens') / 20
445   0.15
446
447   >>> # Estimate the probability of getting 5 or more heads from 7 spins
448   >>> # of a biased coin that settles on heads 60% of the time.
449   >>> def trial():
450   ...     return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5
451   ...
452   >>> sum(trial() for i in range(10_000)) / 10_000
453   0.4169
454
455   >>> # Probability of the median of 5 samples being in middle two quartiles
456   >>> def trial():
457   ...     return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500
458   ...
459   >>> sum(trial() for i in range(10_000)) / 10_000
460   0.7958
461
462Example of `statistical bootstrapping
463<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling
464with replacement to estimate a confidence interval for the mean of a sample::
465
466   # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm
467   from statistics import fmean as mean
468   from random import choices
469
470   data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95]
471   means = sorted(mean(choices(data, k=len(data))) for i in range(100))
472   print(f'The sample mean of {mean(data):.1f} has a 90% confidence '
473         f'interval from {means[5]:.1f} to {means[94]:.1f}')
474
475Example of a `resampling permutation test
476<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_
477to determine the statistical significance or `p-value
478<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference
479between the effects of a drug versus a placebo::
480
481    # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson
482    from statistics import fmean as mean
483    from random import shuffle
484
485    drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
486    placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
487    observed_diff = mean(drug) - mean(placebo)
488
489    n = 10_000
490    count = 0
491    combined = drug + placebo
492    for i in range(n):
493        shuffle(combined)
494        new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
495        count += (new_diff >= observed_diff)
496
497    print(f'{n} label reshufflings produced only {count} instances with a difference')
498    print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
499    print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
500    print(f'hypothesis that there is no difference between the drug and the placebo.')
501
502Simulation of arrival times and service deliveries for a multiserver queue::
503
504    from heapq import heappush, heappop
505    from random import expovariate, gauss
506    from statistics import mean, median, stdev
507
508    average_arrival_interval = 5.6
509    average_service_time = 15.0
510    stdev_service_time = 3.5
511    num_servers = 3
512
513    waits = []
514    arrival_time = 0.0
515    servers = [0.0] * num_servers  # time when each server becomes available
516    for i in range(100_000):
517        arrival_time += expovariate(1.0 / average_arrival_interval)
518        next_server_available = heappop(servers)
519        wait = max(0.0, next_server_available - arrival_time)
520        waits.append(wait)
521        service_duration = gauss(average_service_time, stdev_service_time)
522        service_completed = arrival_time + wait + service_duration
523        heappush(servers, service_completed)
524
525    print(f'Mean wait: {mean(waits):.1f}.  Stdev wait: {stdev(waits):.1f}.')
526    print(f'Median wait: {median(waits):.1f}.  Max wait: {max(waits):.1f}.')
527
528.. seealso::
529
530   `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_
531   a video tutorial by
532   `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_
533   on statistical analysis using just a few fundamental concepts
534   including simulation, sampling, shuffling, and cross-validation.
535
536   `Economics Simulation
537   <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_
538   a simulation of a marketplace by
539   `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective
540   use of many of the tools and distributions provided by this module
541   (gauss, uniform, sample, betavariate, choice, triangular, and randrange).
542
543   `A Concrete Introduction to Probability (using Python)
544   <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_
545   a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering
546   the basics of probability theory, how to write simulations, and
547   how to perform data analysis using Python.
548
549
550Recipes
551-------
552
553The default :func:`.random` returns multiples of 2⁻⁵³ in the range
554*0.0 ≤ x < 1.0*.  All such numbers are evenly spaced and are exactly
555representable as Python floats.  However, many other representable
556floats in that interval are not possible selections.  For example,
557``0.05954861408025609`` isn't an integer multiple of 2⁻⁵³.
558
559The following recipe takes a different approach.  All floats in the
560interval are possible selections.  The mantissa comes from a uniform
561distribution of integers in the range *2⁵² ≤ mantissa < 2⁵³*.  The
562exponent comes from a geometric distribution where exponents smaller
563than *-53* occur half as often as the next larger exponent.
564
565::
566
567    from random import Random
568    from math import ldexp
569
570    class FullRandom(Random):
571
572        def random(self):
573            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
574            exponent = -53
575            x = 0
576            while not x:
577                x = self.getrandbits(32)
578                exponent += x.bit_length() - 32
579            return ldexp(mantissa, exponent)
580
581All :ref:`real valued distributions <real-valued-distributions>`
582in the class will use the new method::
583
584    >>> fr = FullRandom()
585    >>> fr.random()
586    0.05954861408025609
587    >>> fr.expovariate(0.25)
588    8.87925541791544
589
590The recipe is conceptually equivalent to an algorithm that chooses from
591all the multiples of 2⁻¹⁰⁷⁴ in the range *0.0 ≤ x < 1.0*.  All such
592numbers are evenly spaced, but most have to be rounded down to the
593nearest representable Python float.  (The value 2⁻¹⁰⁷⁴ is the smallest
594positive unnormalized float and is equal to ``math.ulp(0.0)``.)
595
596
597.. seealso::
598
599   `Generating Pseudo-random Floating-Point Values
600   <https://allendowney.com/research/rand/downey07randfloat.pdf>`_ a
601   paper by Allen B. Downey describing ways to generate more
602   fine-grained floats than normally generated by :func:`.random`.
603