.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (:term:`>>>` and :term:`...`): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments.  Comments in Python start with the hash character,
``#``, and extend to the end of the physical line.  A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal.  A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   spam = 1  # and this is the second comment
             # ... and now a third!
   text = "# This is not a comment because it's inside quotes."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands.  Start the interpreter and wait for the
primary prompt, ``>>>``.  (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value.  Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses (``()``) can be used for grouping.
For example::

   >>> 2 + 2
   4
   >>> 50 - 5*6
   20
   >>> (50 - 5.0*6) / 4
   5.0
   >>> 8 / 5.0
   1.6

The integer numbers (e.g. ``2``, ``4``, ``20``) have type :class:`int`,
the ones with a fractional part (e.g. ``5.0``, ``1.6``) have type
:class:`float`.  We will see more about numeric types later in the tutorial.

The return type of a division (``/``) operation depends on its operands.  If
both operands are of type :class:`int`, :term:`floor division` is performed
and an :class:`int` is returned.  If either operand is a :class:`float`,
classic division is performed and a :class:`float` is returned.  The ``//``
operator is also provided for doing floor division no matter what the
operands are.  The remainder can be calculated with the ``%`` operator::

   >>> 17 / 3  # int / int -> int
   5
   >>> 17 / 3.0  # int / float -> float
   5.666666666666667
   >>> 17 // 3.0  # explicit floor division discards the fractional part
   5.0
   >>> 17 % 3  # the % operator returns the remainder of the division
   2
   >>> 5 * 3 + 2  # result * divisor + remainder
   17
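
The relationship between ``//`` and ``%`` can also be checked in a small
script.  The following is only a sketch: the variable names are made up for
illustration, and it deliberately avoids the plain ``/`` operator so that it
behaves the same on Python 2 and Python 3.  It uses the built-in
:func:`divmod`, which is not introduced above.

```python
# Floor division and remainder for the same pair of operands.
dividend, divisor = 17, 3

quotient = dividend // divisor    # floor division
remainder = dividend % divisor    # remainder of that division

# divmod() returns the quotient and the remainder together as a tuple.
pair = divmod(dividend, divisor)

# quotient * divisor + remainder gives back the dividend.
rebuilt = quotient * divisor + remainder

# Floor division rounds toward negative infinity, not toward zero,
# so the remainder takes the sign of the divisor.
negative_quotient = -17 // 3
negative_remainder = -17 % 3

print(pair)
print(rebuilt)
```

For negative operands the identity still holds: ``-6 * 3 + 1`` is ``-17``.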

With Python, it is possible to use the ``**`` operator to calculate powers [#]_::

   >>> 5 ** 2  # 5 squared
   25
   >>> 2 ** 7  # 2 to the power of 7
   128

The equal sign (``=``) is used to assign a value to a variable. Afterwards, no
result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5 * 9
   >>> width * height
   900

If a variable is not "defined" (assigned a value), trying to use it will
give you an error::

   >>> n  # try to access an undefined variable
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

In interactive mode, the last printed expression is assigned to the variable
``_``.  This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user.  Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.

In addition to :class:`int` and :class:`float`, Python supports other types of
numbers, such as :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
Python also has built-in support for :ref:`complex numbers <typesnumeric>`,
and uses the ``j`` or ``J`` suffix to indicate the imaginary part
(e.g. ``3+5j``).
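
As a brief illustration of these additional numeric types, the sketch below
compares a float sum with the same sum computed using
:class:`~decimal.Decimal` and :class:`~fractions.Fraction`.  The variable
names are invented for this example, and the values were chosen only to make
the difference visible.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floats cannot represent 0.1 exactly, so this sum is not exactly 0.3.
float_sum = 0.1 + 0.2

# Decimal works in base 10; building values from strings avoids float error.
decimal_sum = Decimal('0.1') + Decimal('0.2')

# Fraction keeps an exact numerator and denominator.
fraction_sum = Fraction(1, 3) + Fraction(1, 6)

# Complex numbers expose their real and imaginary parts as floats.
z = 3 + 5j
real_part = z.real
imaginary_part = z.imag

print(decimal_sum)
print(fraction_sum)
```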


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed
in several ways.  They can be enclosed in single quotes (``'...'``) or
double quotes (``"..."``) with the same result [#]_.  ``\`` can be used
to escape quotes::

   >>> 'spam eggs'  # single quotes
   'spam eggs'
   >>> 'doesn\'t'  # use \' to escape the single quote...
   "doesn't"
   >>> "doesn't"  # ...or use double quotes instead
   "doesn't"
   >>> '"Yes," they said.'
   '"Yes," they said.'
   >>> "\"Yes,\" they said."
   '"Yes," they said.'
   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'

In the interactive interpreter, the output string is enclosed in quotes and
special characters are escaped with backslashes.  While this might sometimes
look different from the input (the enclosing quotes could change), the two
strings are equivalent.  The string is enclosed in double quotes if
the string contains a single quote and no double quotes; otherwise it is
enclosed in single quotes.  The :keyword:`print` statement produces a more
readable output, by omitting the enclosing quotes and by printing escaped
and special characters::

   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'
   >>> print '"Isn\'t," they said.'
   "Isn't," they said.
   >>> s = 'First line.\nSecond line.'  # \n means newline
   >>> s  # without print, \n is included in the output
   'First line.\nSecond line.'
   >>> print s  # with print, \n produces a new line
   First line.
   Second line.

If you don't want characters prefaced by ``\`` to be interpreted as
special characters, you can use *raw strings* by adding an ``r`` before
the first quote::

   >>> print 'C:\some\name'  # here \n means newline!
   C:\some
   ame
   >>> print r'C:\some\name'  # note the r before the quote
   C:\some\name
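
One way to see what the ``r`` prefix changes is to compare string lengths.
The sketch below uses invented example strings; it is meant only as an
illustration of the rule described above.

```python
escaped = 'first\nsecond'   # \n is a single newline character
raw = r'first\nsecond'      # a backslash followed by the letter n

escaped_length = len(escaped)   # 5 + 1 + 6 characters
raw_length = len(raw)           # 5 + 2 + 6 characters

# A raw literal is equivalent to doubling each backslash by hand.
same_text = (raw == 'first\\nsecond')

# Only the escaped version actually contains a line break.
escaped_lines = escaped.splitlines()
raw_lines = raw.splitlines()
```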

String literals can span multiple lines.  One way is using triple-quotes:
``"""..."""`` or ``'''...'''``.  End of lines are automatically
included in the string, but it's possible to prevent this by adding a ``\`` at
the end of the line.  The following example::

   print """\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output (note that the initial newline is not included):

.. code-block:: text

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> # 3 times 'un', followed by 'ium'
   >>> 3 * 'un' + 'ium'
   'unununium'

Two or more *string literals* (i.e. the ones enclosed between quotes) next
to each other are automatically concatenated. ::

   >>> 'Py' 'thon'
   'Python'

This feature is particularly useful when you want to break long strings::

   >>> text = ('Put several strings within parentheses '
   ...         'to have them joined together.')
   >>> text
   'Put several strings within parentheses to have them joined together.'

This only works with two literals though, not with variables or expressions::

   >>> prefix = 'Py'
   >>> prefix 'thon'  # can't concatenate a variable and a string literal
     ...
   SyntaxError: invalid syntax
   >>> ('un' * 3) 'ium'
     ...
   SyntaxError: invalid syntax

If you want to concatenate variables or a variable and a literal, use ``+``::

   >>> prefix + 'thon'
   'Python'

Strings can be *indexed* (subscripted), with the first character having index 0.
There is no separate character type; a character is simply a string of size
one::

   >>> word = 'Python'
   >>> word[0]  # character in position 0
   'P'
   >>> word[5]  # character in position 5
   'n'

Indices may also be negative numbers, to start counting from the right::

   >>> word[-1]  # last character
   'n'
   >>> word[-2]  # second-last character
   'o'
   >>> word[-6]
   'P'

Note that since -0 is the same as 0, negative indices start from -1.

In addition to indexing, *slicing* is also supported.  While indexing is used
to obtain individual characters, *slicing* allows you to obtain a substring::

   >>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
   'Py'
   >>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
   'tho'

Note how the start is always included, and the end always excluded.  This
makes sure that ``s[:i] + s[i:]`` is always equal to ``s``::

   >>> word[:2] + word[2:]
   'Python'
   >>> word[:4] + word[4:]
   'Python'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]   # characters from the beginning to position 2 (excluded)
   'Py'
   >>> word[4:]   # characters from position 4 (included) to the end
   'on'
   >>> word[-2:]  # characters from the second-last (included) to the end
   'on'

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+---+
    | P | y | t | h | o | n |
    +---+---+---+---+---+---+
    0   1   2   3   4   5   6
   -6  -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...6 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds.  For example, the length of ``word[1:3]`` is
2.
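
The slicing rules above can be checked mechanically.  The following sketch
assumes nothing beyond the rules already stated; it uses the built-in
:func:`all` and generator expressions, which appear later in the tutorial,
and the variable names are chosen only for this example.

```python
word = 'Python'
size = len(word)

# s[:i] + s[i:] rebuilds s for every split point, including both ends.
splits_ok = all(word[:i] + word[i:] == word for i in range(size + 1))

# A negative index -k names the same character as index size - k.
negatives_ok = all(word[-k] == word[size - k] for k in range(1, size + 1))

# For in-bounds, non-negative indices the slice length is j - i.
slice_length = len(word[1:3])

# Omitted bounds default to 0 and to the size of the string.
defaults_ok = (word[:2] == word[0:2]) and (word[4:] == word[4:size])
```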

Attempting to use an index that is too large will result in an error::

   >>> word[42]  # the word only has 6 characters
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

However, out-of-range indices are handled gracefully when used for
slicing::

   >>> word[4:42]
   'on'
   >>> word[42:]
   ''
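
In a script, the difference between the two cases shows up as an exception
in one and an ordinary result in the other.  This is a minimal sketch; the
:keyword:`try` statement it relies on is covered later in the tutorial, and
the flag name is made up for the example.

```python
word = 'Python'

# Indexing past the end raises IndexError ...
try:
    word[42]
    index_failed = False
except IndexError:
    index_failed = True

# ... while out-of-range slice bounds are clamped to the string's length.
tail = word[4:42]
empty = word[42:]
```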

Python strings cannot be changed --- they are :term:`immutable`.
Therefore, assigning to an indexed position in the string results in an error::

   >>> word[0] = 'J'
     ...
   TypeError: 'str' object does not support item assignment
   >>> word[2:] = 'py'
     ...
   TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one::

   >>> 'J' + word[1:]
   'Jython'
   >>> word[:2] + 'py'
   'Pypy'
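
The same behaviour can be observed from a script.  The sketch below catches
the :exc:`TypeError` (exception handling is described later in the tutorial)
and then builds a new string instead; the original string is left untouched.
The variable names are illustrative only.

```python
word = 'Python'

# Item assignment on a string is rejected.
try:
    word[0] = 'J'
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Building a new string from pieces of the old one works.
renamed = 'J' + word[1:]
```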

The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`formatstrings`
      Information about string formatting with :meth:`str.format`.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0, a new data type for storing text data is available to
the programmer: the Unicode object. It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software.  Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` indicates to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals.  If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with ``ur`` to have Python use the
*Raw-Unicode-Escape* encoding. It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
``u``. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the more well known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :meth:`encode` method that takes one argument, the
name of the encoding.  Lowercase names for encodings are preferred. ::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'
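
The encode and decode steps form a round trip, which the sketch below
checks.  Note one substitution: instead of the :func:`unicode` function it
calls the ``decode`` method of the byte string, which performs the same
conversion here and also exists on Python 3, so the script runs on either
version.  The characters are written as ``\x`` escapes to keep the source
file pure ASCII; the variable names are invented for the example.

```python
text = u'\xe4\xf6\xfc'               # the three characters from the session above

encoded = text.encode('utf-8')       # Unicode string -> 8-bit string
decoded = encoded.decode('utf-8')    # 8-bit string -> Unicode string

# UTF-8 needs two bytes for each of these characters.
character_count = len(text)
byte_count = len(encoded)

# ASCII cannot represent them at all, so encoding to it fails.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```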


.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values.  The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets.  Lists might contain
items of different types, but usually the items all have the same type. ::

   >>> squares = [1, 4, 9, 16, 25]
   >>> squares
   [1, 4, 9, 16, 25]

Like strings (and all other built-in :term:`sequence` types), lists can be
indexed and sliced::

   >>> squares[0]  # indexing returns the item
   1
   >>> squares[-1]
   25
   >>> squares[-3:]  # slicing returns a new list
   [9, 16, 25]

All slice operations return a new list containing the requested elements.  This
means that the following slice returns a new (shallow) copy of the list::

   >>> squares[:]
   [1, 4, 9, 16, 25]
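
What "shallow" means becomes visible once a list contains other lists.  In
the following sketch (the names ``duplicate``, ``nested`` and ``shallow``
are made up for illustration), the copy is a new outer list, but its items
are still the same objects as in the original.

```python
squares = [1, 4, 9, 16, 25]
duplicate = squares[:]       # a new list with the same items
duplicate[0] = 100           # changes the copy only

nested = [[1, 2], [3, 4]]
shallow = nested[:]          # new outer list, same inner lists
shallow[0].append(99)        # the inner list is shared, so both see this

same_inner_object = shallow[0] is nested[0]
same_outer_object = shallow is nested
```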

Lists also support operations like concatenation::

   >>> squares + [36, 49, 64, 81, 100]
   [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unlike strings, which are :term:`immutable`, lists are a :term:`mutable`
type, i.e. it is possible to change their content::

   >>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
   >>> 4 ** 3  # the cube of 4 is 64, not 65!
   64
   >>> cubes[3] = 64  # replace the wrong value
   >>> cubes
   [1, 8, 27, 64, 125]

You can also add new items at the end of the list, by using
the :meth:`~list.append` *method* (we will see more about methods later)::

   >>> cubes.append(216)  # add the cube of 6
   >>> cubes.append(7 ** 3)  # and the cube of 7
   >>> cubes
   [1, 8, 27, 64, 125, 216, 343]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> letters
   ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> # replace some values
   >>> letters[2:5] = ['C', 'D', 'E']
   >>> letters
   ['a', 'b', 'C', 'D', 'E', 'f', 'g']
   >>> # now remove them
   >>> letters[2:5] = []
   >>> letters
   ['a', 'b', 'f', 'g']
   >>> # clear the list by replacing all the elements with an empty list
   >>> letters[:] = []
   >>> letters
   []
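
Slice assignment can also insert items, by assigning to an empty slice, or
replace several items with fewer.  The sketch below is illustrative only;
the snapshots ``after_insert`` and ``after_shrink`` are names invented for
the example.

```python
letters = ['a', 'b', 'e']

# Assigning to an empty slice inserts items without removing any.
letters[2:2] = ['c', 'd']
after_insert = letters[:]    # snapshot of the list at this point

# Replacing three items with one makes the list shorter.
letters[1:4] = ['X']
after_shrink = letters[:]

print(after_insert)
print(after_shrink)
```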

The built-in function :func:`len` also applies to lists::

   >>> letters = ['a', 'b', 'c', 'd']
   >>> len(letters)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> a = ['a', 'b', 'c']
   >>> n = [1, 2, 3]
   >>> x = [a, n]
   >>> x
   [['a', 'b', 'c'], [1, 2, 3]]
   >>> x[0]
   ['a', 'b', 'c']
   >>> x[0][1]
   'b'
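
A nested list holds references to the inner lists rather than copies of
them.  The following sketch reuses the names ``a``, ``n`` and ``x`` from the
session above to show that a change made through one name is visible
through the other.

```python
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]

# Appending to a is visible through x, because x[0] is the same list object.
a.append('d')
first_row = x[0]

# Likewise, assigning through x changes n.
x[1][0] = 100

shares_first = x[0] is a
```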

.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together.  For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1.  On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place.  The right-hand side expressions
  are evaluated from left to right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true.  In Python, as in C, any non-zero integer value is true; zero is
  false.  The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false.  The test
  used in the example is a simple comparison.  The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements.  At the interactive prompt, you have to type a tab or space(s) for
  each indented line.  In practice you will prepare more complicated input
  for Python with a text editor; all decent text editors have an auto-indent
  facility.  When a compound statement is entered interactively, it must be
  followed by a blank line to indicate completion (since the parser cannot
  guess when you have typed the last line).  Note that each line within a basic
  block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given.  It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings.  Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.
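
The features listed above can be combined in a short script.  The sketch
below is one possible arrangement, not part of the original session: it
wraps the loop in a hypothetical function named ``fib_below`` (functions
are introduced later in the tutorial) and collects the values in a list.
It calls ``print`` with a single parenthesized argument so that it runs
unchanged on Python 2 and Python 3.

```python
def fib_below(limit):
    """Return the Fibonacci numbers smaller than limit, as a list."""
    result = []
    a, b = 0, 1                  # multiple assignment
    while b < limit:             # the loop runs while the comparison is true
        result.append(b)
        a, b = b, a + b          # right-hand side is evaluated first
    return result


# Multiple assignment also swaps two values without a temporary variable.
left, right = 'L', 'R'
left, right = right, left

# An empty list is false, so the loop stops once every item has been removed.
pending = [3, 2, 1]
removed = 0
while pending:
    pending.pop()
    removed = removed + 1

# Join the numbers with spaces, like the trailing-comma print loop above.
print(' '.join(str(value) for value in fib_below(1000)))
```

Returning a list rather than printing inside the loop keeps the computation
separate from the output, which makes the result easy to reuse.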

.. rubric:: Footnotes

.. [#] Since ``**`` has higher precedence than ``-``, ``-3**2`` will be
   interpreted as ``-(3**2)`` and thus result in ``-9``.  To avoid this
   and get ``9``, you can use ``(-3)**2``.

.. [#] Unlike some other languages, special characters such as ``\n`` have the
   same meaning with both single (``'...'``) and double (``"..."``) quotes.
   The only difference between the two is that within single quotes you don't
   need to escape ``"`` (but you have to escape ``\'``) and vice versa.
