.. _tut-informal:

**********************************
An Informal Introduction to Python
**********************************

In the following examples, input and output are distinguished by the presence or
absence of prompts (:term:`>>>` and :term:`...`): to repeat the example, you must type
everything after the prompt, when the prompt appears; lines that do not begin
with a prompt are output from the interpreter. Note that a secondary prompt on a
line by itself in an example means you must type a blank line; this is used to
end a multi-line command.

Many of the examples in this manual, even those entered at the interactive
prompt, include comments. Comments in Python start with the hash character,
``#``, and extend to the end of the physical line. A comment may appear at the
start of a line or following whitespace or code, but not within a string
literal. A hash character within a string literal is just a hash character.
Since comments are to clarify code and are not interpreted by Python, they may
be omitted when typing in examples.

Some examples::

   # this is the first comment
   spam = 1  # and this is the second comment
             # ... and now a third!
   text = "# This is not a comment because it's inside quotes."


.. _tut-calculator:

Using Python as a Calculator
============================

Let's try some simple Python commands. Start the interpreter and wait for the
primary prompt, ``>>>``. (It shouldn't take long.)


.. _tut-numbers:

Numbers
-------

The interpreter acts as a simple calculator: you can type an expression at it
and it will write the value. Expression syntax is straightforward: the
operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages
(for example, Pascal or C); parentheses (``()``) can be used for grouping.
For example::

   >>> 2 + 2
   4
   >>> 50 - 5*6
   20
   >>> (50 - 5.0*6) / 4
   5.0
   >>> 8 / 5.0
   1.6

The integer numbers (e.g. ``2``, ``4``, ``20``) have type :class:`int`,
the ones with a fractional part (e.g. ``5.0``, ``1.6``) have type
:class:`float`. We will see more about numeric types later in the tutorial.

The return type of a division (``/``) operation depends on its operands. If
both operands are of type :class:`int`, :term:`floor division` is performed
and an :class:`int` is returned. If either operand is a :class:`float`,
classic division is performed and a :class:`float` is returned. The ``//``
operator is also provided for doing floor division no matter what the
operands are. The remainder can be calculated with the ``%`` operator::

   >>> 17 / 3  # int / int -> int
   5
   >>> 17 / 3.0  # int / float -> float
   5.666666666666667
   >>> 17 // 3.0  # explicit floor division discards the fractional part
   5.0
   >>> 17 % 3  # the % operator returns the remainder of the division
   2
   >>> 5 * 3 + 2  # result * divisor + remainder
   17

With Python, it is possible to use the ``**`` operator to calculate powers [#]_::

   >>> 5 ** 2  # 5 squared
   25
   >>> 2 ** 7  # 2 to the power of 7
   128

The equal sign (``=``) is used to assign a value to a variable.
Afterwards, no result is displayed before the next interactive prompt::

   >>> width = 20
   >>> height = 5 * 9
   >>> width * height
   900

If a variable is not "defined" (assigned a value), trying to use it will
give you an error::

   >>> n  # try to access an undefined variable
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type operands
convert the integer operand to floating point::

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

In interactive mode, the last printed expression is assigned to the variable
``_``. This means that when you are using Python as a desk calculator, it is
somewhat easier to continue calculations, for example::

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

This variable should be treated as read-only by the user. Don't explicitly
assign a value to it --- you would create an independent local variable with the
same name masking the built-in variable with its magic behavior.

In addition to :class:`int` and :class:`float`, Python supports other types of
numbers, such as :class:`~decimal.Decimal` and :class:`~fractions.Fraction`.
Python also has built-in support for :ref:`complex numbers <typesnumeric>`,
and uses the ``j`` or ``J`` suffix to indicate the imaginary part
(e.g. ``3+5j``).


.. _tut-strings:

Strings
-------

Besides numbers, Python can also manipulate strings, which can be expressed
in several ways. They can be enclosed in single quotes (``'...'``) or
double quotes (``"..."``) with the same result [#]_. ``\`` can be used
to escape quotes::

   >>> 'spam eggs'  # single quotes
   'spam eggs'
   >>> 'doesn\'t'  # use \' to escape the single quote...
   "doesn't"
   >>> "doesn't"  # ...or use double quotes instead
   "doesn't"
   >>> '"Yes," they said.'
   '"Yes," they said.'
   >>> "\"Yes,\" they said."
   '"Yes," they said.'
   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'

In the interactive interpreter, the output string is enclosed in quotes and
special characters are escaped with backslashes. While this might sometimes
look different from the input (the enclosing quotes could change), the two
strings are equivalent. The string is enclosed in double quotes if
the string contains a single quote and no double quotes, otherwise it is
enclosed in single quotes. The :keyword:`print` statement produces a more
readable output, by omitting the enclosing quotes and by printing escaped
and special characters::

   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'
   >>> print '"Isn\'t," they said.'
   "Isn't," they said.
   >>> s = 'First line.\nSecond line.'  # \n means newline
   >>> s  # without print, \n is included in the output
   'First line.\nSecond line.'
   >>> print s  # with print, \n produces a new line
   First line.
   Second line.

If you don't want characters prefaced by ``\`` to be interpreted as
special characters, you can use *raw strings* by adding an ``r`` before
the first quote::

   >>> print 'C:\some\name'  # here \n means newline!
   C:\some
   ame
   >>> print r'C:\some\name'  # note the r before the quote
   C:\some\name

String literals can span multiple lines. One way is using triple-quotes:
``"""..."""`` or ``'''...'''``. End of lines are automatically
included in the string, but it's possible to prevent this by adding a ``\`` at
the end of the line. The following example::

   print """\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output (note that the initial newline is not included):

.. code-block:: text

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

Strings can be concatenated (glued together) with the ``+`` operator, and
repeated with ``*``::

   >>> # 3 times 'un', followed by 'ium'
   >>> 3 * 'un' + 'ium'
   'unununium'

Two or more *string literals* (i.e. the ones enclosed between quotes) next
to each other are automatically concatenated. ::

   >>> 'Py' 'thon'
   'Python'

This feature is particularly useful when you want to break long strings::

   >>> text = ('Put several strings within parentheses '
   ...         'to have them joined together.')
   >>> text
   'Put several strings within parentheses to have them joined together.'

This only works with two literals though, not with variables or expressions::

   >>> prefix = 'Py'
   >>> prefix 'thon'  # can't concatenate a variable and a string literal
     ...
   SyntaxError: invalid syntax
   >>> ('un' * 3) 'ium'
     ...
   SyntaxError: invalid syntax

If you want to concatenate variables or a variable and a literal, use ``+``::

   >>> prefix + 'thon'
   'Python'

Strings can be *indexed* (subscripted), with the first character having index 0.
There is no separate character type; a character is simply a string of size
one::

   >>> word = 'Python'
   >>> word[0]  # character in position 0
   'P'
   >>> word[5]  # character in position 5
   'n'

Indices may also be negative numbers, to start counting from the right::

   >>> word[-1]  # last character
   'n'
   >>> word[-2]  # second-last character
   'o'
   >>> word[-6]
   'P'

Note that since -0 is the same as 0, negative indices start from -1.

In addition to indexing, *slicing* is also supported.
While indexing is used
to obtain individual characters, *slicing* allows you to obtain a substring::

   >>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
   'Py'
   >>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
   'tho'

Note how the start is always included, and the end always excluded. This
makes sure that ``s[:i] + s[i:]`` is always equal to ``s``::

   >>> word[:2] + word[2:]
   'Python'
   >>> word[:4] + word[4:]
   'Python'

Slice indices have useful defaults; an omitted first index defaults to zero, an
omitted second index defaults to the size of the string being sliced. ::

   >>> word[:2]  # characters from the beginning to position 2 (excluded)
   'Py'
   >>> word[4:]  # characters from position 4 (included) to the end
   'on'
   >>> word[-2:]  # characters from the second-last (included) to the end
   'on'

One way to remember how slices work is to think of the indices as pointing
*between* characters, with the left edge of the first character numbered 0.
Then the right edge of the last character of a string of *n* characters has
index *n*, for example::

    +---+---+---+---+---+---+
    | P | y | t | h | o | n |
    +---+---+---+---+---+---+
    0   1   2   3   4   5   6
   -6  -5  -4  -3  -2  -1

The first row of numbers gives the position of the indices 0...6 in the string;
the second row gives the corresponding negative indices. The slice from *i* to
*j* consists of all characters between the edges labeled *i* and *j*,
respectively.

For non-negative indices, the length of a slice is the difference of the
indices, if both are within bounds. For example, the length of ``word[1:3]`` is
2.
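
The edge-numbering picture can be checked directly. The following is a small
sketch rather than part of the original tutorial: the names ``n`` and
``negative_edges`` are introduced here purely for illustration, and the code
uses ``assert`` instead of ``print`` so that it should run unchanged on both
Python 2 and Python 3.

```python
word = 'Python'
n = len(word)  # illustrative name: number of characters, i.e. the last edge

# Start included, end excluded: every split point i reassembles the string.
for i in range(n + 1):
    assert word[:i] + word[i:] == word

# For in-bounds, non-negative indices with i <= j, the slice length is j - i.
for i in range(n + 1):
    for j in range(i, n + 1):
        assert len(word[i:j]) == j - i

# A negative index -k labels the same edge as the non-negative index n - k.
negative_edges = [(-k, n - k) for k in range(1, n + 1)]  # hypothetical helper list
for neg, pos in negative_edges:
    assert word[neg:] == word[pos:]
```

If every assertion passes silently, the three rules hold for ``word``; changing
``word`` to another string is an easy way to convince yourself they are general.
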

Attempting to use an index that is too large will result in an error::

   >>> word[42]  # the word only has 6 characters
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

However, out-of-range slice indices are handled gracefully when used for
slicing::

   >>> word[4:42]
   'on'
   >>> word[42:]
   ''

Python strings cannot be changed --- they are :term:`immutable`.
Therefore, assigning to an indexed position in the string results in an error::

   >>> word[0] = 'J'
     ...
   TypeError: 'str' object does not support item assignment
   >>> word[2:] = 'py'
     ...
   TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one::

   >>> 'J' + word[1:]
   'Jython'
   >>> word[:2] + 'py'
   'Pypy'

The built-in function :func:`len` returns the length of a string::

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34


.. seealso::

   :ref:`typesseq`
      Strings, and the Unicode strings described in the next section, are
      examples of *sequence types*, and support the common operations supported
      by such types.

   :ref:`string-methods`
      Both strings and Unicode strings support a large number of methods for
      basic transformations and searching.

   :ref:`formatstrings`
      Information about string formatting with :meth:`str.format`.

   :ref:`string-formatting`
      The old formatting operations invoked when strings and Unicode strings are
      the left operand of the ``%`` operator are described in more detail here.


.. _tut-unicodestrings:

Unicode Strings
---------------

.. sectionauthor:: Marc-Andre Lemburg <mal@lemburg.com>


Starting with Python 2.0 a new data type for storing text data is available to
the programmer: the Unicode object.
It can be used to store and manipulate
Unicode data (see http://www.unicode.org/) and integrates well with the existing
string objects, providing auto-conversions where necessary.

Unicode has the advantage of providing one ordinal for every character in every
script used in modern and ancient texts. Previously, there were only 256
possible ordinals for script characters. Texts were typically bound to a code
page which mapped the ordinals to script characters. This led to a great deal of
confusion, especially with respect to internationalization (usually written as
``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves
these problems by defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating normal
strings::

   >>> u'Hello World !'
   u'Hello World !'

The small ``'u'`` in front of the quote indicates that a Unicode string is
supposed to be created. If you want to include special characters in the string,
you can do so by using the Python *Unicode-Escape* encoding. The following
example shows how::

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence ``\u0020`` tells Python to insert the Unicode character with
the ordinal value 0x0020 (the space character) at the given position.

Other characters are interpreted by using their respective ordinal values
directly as Unicode ordinals. If you have literal strings in the standard
Latin-1 encoding that is used in many Western countries, you will find it
convenient that the lower 256 characters of Unicode are the same as the 256
characters of Latin-1.

For experts, there is also a raw mode just like the one for normal strings. You
have to prefix the opening quote with 'ur' to have Python use the
*Raw-Unicode-Escape* encoding.
It will only apply the above ``\uXXXX``
conversion if there is an odd number of backslashes in front of the small
'u'. ::

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of backslashes, as can
be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of other ways
of creating Unicode strings on the basis of a known encoding.

.. index:: builtin: unicode

The built-in function :func:`unicode` provides access to all registered Unicode
codecs (COders and DECoders). Some of the better-known encodings which these
codecs can convert are *Latin-1*, *ASCII*, *UTF-8*, and *UTF-16*. The latter two
are variable-length encodings that store each Unicode character in one or more
bytes. The default encoding is normally set to ASCII, which passes through
characters in the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
:func:`str`, conversion takes place using this default encoding. ::

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific encoding,
Unicode objects provide an :func:`encode` method that takes one argument, the
name of the encoding. Lowercase names for encodings are preferred. ::

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a corresponding
Unicode string from it, you can use the :func:`unicode` function with the
encoding name as the second argument. ::

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'


.. _tut-lists:

Lists
-----

Python knows a number of *compound* data types, used to group together other
values. The most versatile is the *list*, which can be written as a list of
comma-separated values (items) between square brackets. Lists might contain
items of different types, but usually the items all have the same type. ::

   >>> squares = [1, 4, 9, 16, 25]
   >>> squares
   [1, 4, 9, 16, 25]

Like strings (and all other built-in :term:`sequence` types), lists can be
indexed and sliced::

   >>> squares[0]  # indexing returns the item
   1
   >>> squares[-1]
   25
   >>> squares[-3:]  # slicing returns a new list
   [9, 16, 25]

All slice operations return a new list containing the requested elements. This
means that the following slice returns a new (shallow) copy of the list::

   >>> squares[:]
   [1, 4, 9, 16, 25]

Lists also support operations like concatenation::

   >>> squares + [36, 49, 64, 81, 100]
   [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unlike strings, which are :term:`immutable`, lists are a :term:`mutable`
type, i.e. it is possible to change their content::

   >>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
   >>> 4 ** 3  # the cube of 4 is 64, not 65!
   64
   >>> cubes[3] = 64  # replace the wrong value
   >>> cubes
   [1, 8, 27, 64, 125]

You can also add new items at the end of the list, by using
the :meth:`~list.append` *method* (we will see more about methods later)::

   >>> cubes.append(216)  # add the cube of 6
   >>> cubes.append(7 ** 3)  # and the cube of 7
   >>> cubes
   [1, 8, 27, 64, 125, 216, 343]

Assignment to slices is also possible, and this can even change the size of the
list or clear it entirely::

   >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> letters
   ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> # replace some values
   >>> letters[2:5] = ['C', 'D', 'E']
   >>> letters
   ['a', 'b', 'C', 'D', 'E', 'f', 'g']
   >>> # now remove them
   >>> letters[2:5] = []
   >>> letters
   ['a', 'b', 'f', 'g']
   >>> # clear the list by replacing all the elements with an empty list
   >>> letters[:] = []
   >>> letters
   []

The built-in function :func:`len` also applies to lists::

   >>> letters = ['a', 'b', 'c', 'd']
   >>> len(letters)
   4

It is possible to nest lists (create lists containing other lists), for
example::

   >>> a = ['a', 'b', 'c']
   >>> n = [1, 2, 3]
   >>> x = [a, n]
   >>> x
   [['a', 'b', 'c'], [1, 2, 3]]
   >>> x[0]
   ['a', 'b', 'c']
   >>> x[0][1]
   'b'

.. _tut-firststeps:

First Steps Towards Programming
===============================

Of course, we can use Python for more complicated tasks than adding two and two
together. For instance, we can write an initial sub-sequence of the *Fibonacci*
series as follows::

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

* The first line contains a *multiple assignment*: the variables ``a`` and ``b``
  simultaneously get the new values 0 and 1. On the last line this is used again,
  demonstrating that the expressions on the right-hand side are all evaluated
  first before any of the assignments take place. The right-hand side expressions
  are evaluated from the left to the right.

* The :keyword:`while` loop executes as long as the condition (here: ``b < 10``)
  remains true. In Python, like in C, any non-zero integer value is true; zero is
  false. The condition may also be a string or list value, in fact any sequence;
  anything with a non-zero length is true, empty sequences are false. The test
  used in the example is a simple comparison. The standard comparison operators
  are written the same as in C: ``<`` (less than), ``>`` (greater than), ``==``
  (equal to), ``<=`` (less than or equal to), ``>=`` (greater than or equal to)
  and ``!=`` (not equal to).

* The *body* of the loop is *indented*: indentation is Python's way of grouping
  statements. At the interactive prompt, you have to type a tab or space(s) for
  each indented line. In practice you will prepare more complicated input
  for Python with a text editor; all decent text editors have an auto-indent
  facility. When a compound statement is entered interactively, it must be
  followed by a blank line to indicate completion (since the parser cannot
  guess when you have typed the last line). Note that each line within a basic
  block must be indented by the same amount.

* The :keyword:`print` statement writes the value of the expression(s) it is
  given. It differs from just writing the expression you want to write (as we did
  earlier in the calculator examples) in the way it handles multiple expressions
  and strings. Strings are printed without quotes, and a space is inserted
  between items, so you can format things nicely, like this::

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output::

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the next prompt if
  the last line was not completed.

.. rubric:: Footnotes

.. [#] Since ``**`` has higher precedence than ``-``, ``-3**2`` will be
   interpreted as ``-(3**2)`` and thus result in ``-9``. To avoid this
   and get ``9``, you can use ``(-3)**2``.

.. [#] Unlike other languages, special characters such as ``\n`` have the
   same meaning with both single (``'...'``) and double (``"..."``) quotes.
   The only difference between the two is that within single quotes you don't
   need to escape ``"`` (but you have to escape ``\'``) and vice versa.