.. _tutorials.gettingstarted.usingdatastore:

Using the Datastore
===================
Storing data in a scalable web application can be tricky. A user could be
interacting with any of dozens of web servers at a given time, and the user's
next request could go to a different web server than the one that handled the
previous request. All web servers need to be interacting with data that is
also spread out across dozens of machines, possibly in different locations
around the world.

Thanks to Google App Engine, you don't have to worry about any of that.
App Engine's infrastructure takes care of all of the distribution, replication
and load balancing of data behind a simple API, and you get a powerful
query engine and transactions as well.

The default datastore for an application is now the `High Replication datastore <http://code.google.com/appengine/docs/python/datastore/hr/>`_.
This datastore uses the `Paxos algorithm <http://labs.google.com/papers/paxos_made_live.html>`_
to replicate data across datacenters. The High Replication datastore is
extremely resilient in the face of catastrophic failure.

One of the consequences of this is that the consistency guarantee for the
datastore may differ from what you are familiar with. It also differs slightly
from the Master/Slave datastore, the other datastore option that App Engine
offers. In the example code comments, we highlight some ways this might affect
the design of your app. For more detailed information,
see `Using the High Replication Datastore <http://code.google.com/appengine/docs/python/datastore/hr/overview.html>`_
(HRD).

The datastore writes data in objects known as entities, and each entity has a
key that identifies the entity. Entities can belong to the same entity group,
which allows you to perform a single transaction with multiple entities.
Entity groups have a parent key that identifies the entire entity group.

In the High Replication Datastore, entity groups are also a unit of
consistency. Queries over multiple entity groups may return stale, `eventually consistent <http://en.wikipedia.org/wiki/Eventual_consistency>`_
results. Queries over a single entity group, called ancestor queries, return
up-to-date, strongly consistent results. An ancestor query uses the parent key
rather than a specific entity's key.

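To make the relationship between keys, entity groups, and ancestor queries
more concrete, the following is a minimal pure-Python sketch. It is not the
App Engine API: keys are modeled as tuples of ``(kind, name)`` pairs, and the
names ``entity_group`` and ``ancestor_query`` are hypothetical helpers
introduced only for illustration.

.. code-block:: python

   # Toy model only: a key is a path of (kind, name) pairs, root first.
   guestbook = (("Guestbook", "default_guestbook"),)
   other_book = (("Guestbook", "other_guestbook"),)

   entities = {
       guestbook + (("Greeting", "g1"),): "Hello",
       guestbook + (("Greeting", "g2"),): "Hi again",
       other_book + (("Greeting", "g3"),): "Elsewhere",
   }


   def entity_group(key):
       """Return the root of the key path, which identifies the entity group."""
       return key[:1]


   def ancestor_query(store, ancestor):
       """Return the values whose key path starts with the ancestor's path."""
       return sorted(value for key, value in store.items()
                     if key[:len(ancestor)] == ancestor)


   # Only greetings stored under the default guestbook are returned.
   print(ancestor_query(entities, guestbook))

In this toy store there is no ``Guestbook`` entity at all; the shared parent
key path alone is what groups the greetings together.
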
The code samples in this guide organize related entities into entity groups,
and use ancestor queries on those entity groups to return strongly consistent
results.


A Complete Example Using the Datastore
--------------------------------------
Here is a new version of ``helloworld/helloworld.py`` that stores greetings
in the datastore. The rest of this page discusses the new pieces::

    import cgi
    import datetime
    import urllib
    import wsgiref.handlers

    from google.appengine.ext import db
    from google.appengine.api import users
    import webapp2


    class Greeting(db.Model):
      """Models an individual Guestbook entry with an author, content, and date."""
      author = db.UserProperty()
      content = db.StringProperty(multiline=True)
      date = db.DateTimeProperty(auto_now_add=True)


    def guestbook_key(guestbook_name=None):
      """Constructs a datastore key for a Guestbook entity with guestbook_name."""
      return db.Key.from_path('Guestbook', guestbook_name or 'default_guestbook')


    class MainPage(webapp2.RequestHandler):
      def get(self):
        self.response.out.write('<html><body>')
        guestbook_name = self.request.get('guestbook_name')

        # Ancestor queries, as shown here, are strongly consistent with the High
        # Replication datastore. Queries that span entity groups are eventually
        # consistent. If we omitted the ancestor from this query, there would be
        # a slight chance that a Greeting that had just been written would not
        # show up in a query.
        greetings = db.GqlQuery("SELECT * "
                                "FROM Greeting "
                                "WHERE ANCESTOR IS :1 "
                                "ORDER BY date DESC LIMIT 10",
                                guestbook_key(guestbook_name))

        for greeting in greetings:
          if greeting.author:
            self.response.out.write(
                '<b>%s</b> wrote:' % greeting.author.nickname())
          else:
            self.response.out.write('An anonymous person wrote:')
          self.response.out.write('<blockquote>%s</blockquote>' %
                                  cgi.escape(greeting.content))

        self.response.out.write("""
              <form action="/sign?%s" method="post">
                <div><textarea name="content" rows="3" cols="60"></textarea></div>
                <div><input type="submit" value="Sign Guestbook"></div>
              </form>
              <hr>
              <form>Guestbook name: <input value="%s" name="guestbook_name">
              <input type="submit" value="switch"></form>
            </body>
          </html>""" % (urllib.urlencode({'guestbook_name': guestbook_name}),
                              cgi.escape(guestbook_name)))


    class Guestbook(webapp2.RequestHandler):
      def post(self):
        # We set the same parent key on each Greeting to ensure that all
        # greetings are in the same entity group. Queries across the single
        # entity group will be consistent. However, the write rate to a single
        # entity group should be limited to ~1/second.
        guestbook_name = self.request.get('guestbook_name')
        greeting = Greeting(parent=guestbook_key(guestbook_name))

        if users.get_current_user():
          greeting.author = users.get_current_user()

        greeting.content = self.request.get('content')
        greeting.put()
        self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))


    application = webapp2.WSGIApplication([
      ('/', MainPage),
      ('/sign', Guestbook)
    ], debug=True)


    def main():
      application.run()


    if __name__ == '__main__':
      main()

Replace ``helloworld/helloworld.py`` with this, then reload
`http://localhost:8080/ <http://localhost:8080/>`_ in your browser. Post a
few messages to verify that messages get stored and displayed correctly.


Storing the Submitted Greetings
-------------------------------
App Engine includes a data modeling API for Python. It's similar to Django's
data modeling API, but uses App Engine's scalable datastore behind the scenes.

For the guestbook application, we want to store greetings posted by users.
Each greeting includes the author's name, the message content, and the date
and time the message was posted so we can display messages in chronological
order.

To use the data modeling API, import the ``google.appengine.ext.db`` module::

    from google.appengine.ext import db

The following defines a data model for a greeting::

    class Greeting(db.Model):
        author = db.UserProperty()
        content = db.StringProperty(multiline=True)
        date = db.DateTimeProperty(auto_now_add=True)

This defines a ``Greeting`` model with three properties: ``author`` whose
value is a ``User`` object, ``content`` whose value is a string, and ``date``
whose value is a ``datetime.datetime``.

Some property constructors take parameters to further configure their behavior.
Giving the ``db.StringProperty`` constructor the ``multiline=True`` parameter
says that values for this property can contain newline characters. Giving the
``db.DateTimeProperty`` constructor an ``auto_now_add=True`` parameter
configures the model to automatically give new objects a ``date`` of the time
the object is created, if the application doesn't otherwise provide a value.
For a complete list of property types and their options, see `the Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.

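The effect of ``auto_now_add=True`` can be approximated in plain Python with a
default factory. This is only a sketch of the behavior described above, written
for Python 3 and using a hypothetical ``GreetingSketch`` dataclass; it is not
how ``db.DateTimeProperty`` is implemented.

.. code-block:: python

   import datetime
   from dataclasses import dataclass, field
   from typing import Optional


   @dataclass
   class GreetingSketch:
       """Plain-Python stand-in for the Greeting model (not a db.Model)."""
       content: str
       author: Optional[str] = None
       # Like auto_now_add=True: filled in at creation unless a value is given.
       date: datetime.datetime = field(default_factory=datetime.datetime.now)


   explicit = datetime.datetime(2011, 1, 1, 12, 0)
   automatic = GreetingSketch("first")
   provided = GreetingSketch("second", date=explicit)

   print(isinstance(automatic.date, datetime.datetime))
   print(provided.date == explicit)
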
Now that we have a data model for greetings, the application can use the model
to create new ``Greeting`` objects and put them into the datastore. The following
new version of the ``Guestbook`` handler creates new greetings and saves them
to the datastore::

    class Guestbook(webapp2.RequestHandler):
        def post(self):
            guestbook_name = self.request.get('guestbook_name')
            greeting = Greeting(parent=guestbook_key(guestbook_name))

            if users.get_current_user():
                greeting.author = users.get_current_user()

            greeting.content = self.request.get('content')
            greeting.put()
            self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))

This new ``Guestbook`` handler creates a new ``Greeting`` object, then sets its
``author`` and ``content`` properties with the data posted by the user. The
handler does not set the ``date`` property, so ``date`` is automatically set to
"now," as we configured the model to do.

The parent has an entity kind "Guestbook". There is no need to create the
"Guestbook" entity before setting it to be the parent of another entity. In
this example, the parent is used as a placeholder for transaction and
consistency purposes. See `Entity Groups and Ancestor Paths <http://code.google.com/appengine/docs/python/datastore/entities.html#Entity_Groups_and_Ancestor_Paths>`_
for more information. Objects that share a common `ancestor <http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query_ancestor>`_
belong to the same entity group.

Finally, ``greeting.put()`` saves our new object to the datastore. If we had
acquired this object from a query, ``put()`` would have updated the existing
object. Since we created this object with the model constructor, ``put()`` adds
the new object to the datastore.

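The difference between adding and updating with ``put()`` can be illustrated
with a small in-memory stand-in. The ``ToyStore`` class below is hypothetical
and greatly simplified (it assumes integer keys assigned on first write); it
only mirrors the add-versus-update behavior described above.

.. code-block:: python

   import itertools


   class ToyStore:
       """In-memory stand-in for the datastore; not the App Engine API."""

       def __init__(self):
           self._rows = {}
           self._ids = itertools.count(1)

       def put(self, entity):
           # A new entity has no key yet, so the store assigns one (an add).
           if entity.get("key") is None:
               entity["key"] = next(self._ids)
           # Writing under an existing key replaces the stored copy (an update).
           self._rows[entity["key"]] = dict(entity)
           return entity["key"]

       def count(self):
           return len(self._rows)


   store = ToyStore()
   greeting = {"key": None, "content": "Hello"}
   store.put(greeting)                   # adds a new row
   greeting["content"] = "Hello again"
   store.put(greeting)                   # updates the same row
   print(store.count())
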
Because querying in the High Replication datastore is only strongly consistent
within entity groups, we assign all Greetings to the same entity group in this
example by setting the same parent for each Greeting. This means a user will
always see a Greeting immediately after it was written. However, the rate at
which you can write to the same entity group is limited to about one write
per second. When you design a real application you'll need to keep this limit
in mind. Note that by using services such as `Memcache <http://code.google.com/appengine/docs/python/memcache/>`_,
you can reduce the chance that a user won't see fresh results when querying
across entity groups immediately after a write.


Retrieving the Stored Greetings With GQL
----------------------------------------
The App Engine datastore has a sophisticated query engine for data models.
Because the App Engine datastore is not a traditional relational database,
queries are not specified using SQL. Instead, you can prepare queries using a
SQL-like query language we call GQL. GQL provides access to the App Engine
datastore query engine's features using a familiar syntax.

The following new version of the ``MainPage`` handler queries the datastore
for greetings::

    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.out.write('<html><body>')
            guestbook_name = self.request.get('guestbook_name')

            greetings = db.GqlQuery("SELECT * "
                                    "FROM Greeting "
                                    "WHERE ANCESTOR IS :1 "
                                    "ORDER BY date DESC LIMIT 10",
                                    guestbook_key(guestbook_name))

            for greeting in greetings:
                if greeting.author:
                    self.response.out.write('<b>%s</b> wrote:' % greeting.author.nickname())
                else:
                    self.response.out.write('An anonymous person wrote:')
                self.response.out.write('<blockquote>%s</blockquote>' %
                                        cgi.escape(greeting.content))

            # Write the submission form and the footer of the page
            self.response.out.write("""
                  <form action="/sign" method="post">
                    <div><textarea name="content" rows="3" cols="60"></textarea></div>
                    <div><input type="submit" value="Sign Guestbook"></div>
                  </form>
                </body>
              </html>""")

The query happens here::

    greetings = db.GqlQuery("SELECT * "
                            "FROM Greeting "
                            "WHERE ANCESTOR IS :1 "
                            "ORDER BY date DESC LIMIT 10",
                            guestbook_key(guestbook_name))

Alternatively, you can call the ``gql(...)`` method on the ``Greeting`` class,
and omit the ``SELECT * FROM Greeting`` from the query::

    greetings = Greeting.gql("WHERE ANCESTOR IS :1 ORDER BY date DESC LIMIT 10",
                             guestbook_key(guestbook_name))

As with SQL, keywords (such as ``SELECT``) are case insensitive. Names,
however, are case sensitive.

Because the query returns full data objects, it does not make sense to select
specific properties from the model. All GQL queries start with
``SELECT * FROM model`` (or are so implied by the model's ``gql(...)`` method)
so as to resemble their SQL equivalents.

A GQL query can have a ``WHERE`` clause that filters the result set by one or
more conditions based on property values. Rather than embedding values in the
query string, the examples in this guide pass all values through parameter
binding. For example, to get only the greetings posted by the current user::

    if users.get_current_user():
        greetings = Greeting.gql(
            "WHERE ANCESTOR IS :1 AND author = :2 ORDER BY date DESC",
            guestbook_key(guestbook_name), users.get_current_user())

You can also use named parameters instead of positional parameters::

    greetings = Greeting.gql(
        "WHERE ANCESTOR IS :ancestor AND author = :author ORDER BY date DESC",
        ancestor=guestbook_key(guestbook_name),
        author=users.get_current_user())

In addition to GQL, the datastore API provides another mechanism for building
query objects using methods. The query above could also be prepared as follows::

    greetings = Greeting.all()
    greetings.ancestor(guestbook_key(guestbook_name))
    greetings.filter("author =", users.get_current_user())
    greetings.order("-date")

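As a rough mental model of what the filter and ordering in these queries do,
here is a plain-Python sketch over a list of dictionaries. The sample data and
the ``query_greetings`` helper are hypothetical; the real datastore evaluates
queries against indexes rather than scanning a list.

.. code-block:: python

   import datetime

   # Stand-ins for stored greetings; in the real app these would be entities.
   greetings = [
       {"author": "alice", "content": "first",
        "date": datetime.datetime(2011, 1, 1)},
       {"author": "bob", "content": "second",
        "date": datetime.datetime(2011, 1, 2)},
       {"author": "alice", "content": "third",
        "date": datetime.datetime(2011, 1, 3)},
   ]


   def query_greetings(items, author, limit=10):
       """Mimic filter("author =", author), order("-date"), then a limit."""
       matching = [g for g in items if g["author"] == author]
       matching.sort(key=lambda g: g["date"], reverse=True)
       return matching[:limit]


   # Newest greetings by the given author come first.
   print([g["content"] for g in query_greetings(greetings, "alice")])
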
For a complete description of GQL and the query APIs, see the `Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.


Clearing the Development Server Datastore
-----------------------------------------
The development web server uses a local version of the datastore for testing
your application, using temporary files. The data persists as long as the
temporary files exist, and the web server does not reset these files unless
you ask it to do so.

If you want the development server to erase its datastore prior to starting up,
use the ``--clear_datastore`` option when starting the server:

.. code-block:: text

   dev_appserver.py --clear_datastore helloworld/


Next...
-------
We now have a working guestbook application that authenticates users using
Google accounts, lets them submit messages, and displays messages other users
have left. Because App Engine handles scaling automatically, we will not need
to revisit this code as our application gets popular.

This latest version mixes HTML content with the code for the ``MainPage``
handler. This will make it difficult to change the appearance of the application,
especially as our application gets bigger and more complex. Let's use
templates to manage the appearance, and introduce static files for a CSS
stylesheet.

Continue to :ref:`tutorials.gettingstarted.templates`.
