.. _tutorials.gettingstarted.usingdatastore:

Using the Datastore
===================
Storing data in a scalable web application can be tricky. A user could be
interacting with any of dozens of web servers at a given time, and the user's
next request could go to a different web server than the one that handled the
previous request. All web servers need to be interacting with data that is
also spread out across dozens of machines, possibly in different locations
around the world.

Thanks to Google App Engine, you don't have to worry about any of that.
App Engine's infrastructure takes care of all of the distribution, replication
and load balancing of data behind a simple API -- and you get a powerful
query engine and transactions as well.

The default datastore for an application is now the `High Replication datastore <http://code.google.com/appengine/docs/python/datastore/hr/>`_.
This datastore uses the `Paxos algorithm <http://labs.google.com/papers/paxos_made_live.html>`_
to replicate data across datacenters. The High Replication datastore is
extremely resilient in the face of catastrophic failure.

One of the consequences of this is that the consistency guarantee for the
datastore may differ from what you are familiar with. It also differs slightly
from the Master/Slave datastore, the other datastore option that App Engine
offers. In the example code comments, we highlight some ways this might affect
the design of your app. For more detailed information,
see `Using the High Replication Datastore <http://code.google.com/appengine/docs/python/datastore/hr/overview.html>`_
(HRD).

The datastore writes data in objects known as entities, and each entity has a
key that identifies the entity. Entities can belong to the same entity group,
which allows you to perform a single transaction with multiple entities.
Entity groups have a parent key that identifies the entire entity group.
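
The relationship between keys, parents, and entity groups can be pictured with
a small plain-Python sketch. This is a toy model, not the App Engine API:
``child_key`` and ``entity_group`` are hypothetical helpers, and representing a
key as a tuple of ``(kind, name)`` pairs is an assumption made for illustration
only.

.. code-block:: python

    # Toy model only (not the App Engine API): a key is a tuple of
    # (kind, name) pairs, listed from the root down to the entity itself.

    def child_key(parent, kind, name):
        """Return the key of an entity stored under the given parent key."""
        return parent + ((kind, name),)


    def entity_group(key):
        """Return the root element of the key, which identifies the group."""
        return key[:1]


    # Two guestbook roots; each one is the parent key of its own entity group.
    default_book = (('Guestbook', 'default_guestbook'),)
    party_book = (('Guestbook', 'party'),)

    first = child_key(default_book, 'Greeting', 'g1')
    second = child_key(default_book, 'Greeting', 'g2')
    other = child_key(party_book, 'Greeting', 'g1')

    # Greetings under the same parent share an entity group.
    print(entity_group(first) == entity_group(second))
    # Greetings under different parents do not.
    print(entity_group(first) == entity_group(other))

In this model, ``first`` and ``second`` share the root
``('Guestbook', 'default_guestbook')`` and so fall in the same entity group,
while ``other`` belongs to a different one.
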

In the High Replication Datastore, entity groups are also a unit of
consistency. Queries over multiple entity groups may return stale, `eventually consistent <http://en.wikipedia.org/wiki/Eventual_consistency>`_
results. Queries over a single entity group return up-to-date, strongly
consistent results. Queries over a single entity group are called ancestor
queries. Ancestor queries use the parent key (instead of a specific entity's
key).

The code samples in this guide organize like entities into entity groups, and
use ancestor queries on those entity groups to return strongly consistent
results.


A Complete Example Using the Datastore
--------------------------------------
Here is a new version of ``helloworld/helloworld.py`` that stores greetings
in the datastore.
The rest of this page discusses the new pieces::

    import cgi
    import datetime
    import urllib
    import wsgiref.handlers

    from google.appengine.ext import db
    from google.appengine.api import users
    import webapp2


    class Greeting(db.Model):
        """Models an individual Guestbook entry with an author, content, and date."""
        author = db.UserProperty()
        content = db.StringProperty(multiline=True)
        date = db.DateTimeProperty(auto_now_add=True)


    def guestbook_key(guestbook_name=None):
        """Constructs a datastore key for a Guestbook entity with guestbook_name."""
        return db.Key.from_path('Guestbook', guestbook_name or 'default_guestbook')


    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.out.write('<html><body>')
            guestbook_name = self.request.get('guestbook_name')

            # Ancestor queries, as shown here, are strongly consistent with the
            # High Replication datastore. Queries that span entity groups are
            # eventually consistent. If we omitted the ancestor from this query,
            # there would be a slight chance that a Greeting that had just been
            # written would not show up in a query.
            greetings = db.GqlQuery("SELECT * "
                                    "FROM Greeting "
                                    "WHERE ANCESTOR IS :1 "
                                    "ORDER BY date DESC LIMIT 10",
                                    guestbook_key(guestbook_name))

            for greeting in greetings:
                if greeting.author:
                    self.response.out.write(
                        '<b>%s</b> wrote:' % greeting.author.nickname())
                else:
                    self.response.out.write('An anonymous person wrote:')
                self.response.out.write('<blockquote>%s</blockquote>' %
                                        cgi.escape(greeting.content))

            self.response.out.write("""
              <form action="/sign?%s" method="post">
                <div><textarea name="content" rows="3" cols="60"></textarea></div>
                <div><input type="submit" value="Sign Guestbook"></div>
              </form>
              <hr>
              <form>Guestbook name: <input value="%s" name="guestbook_name">
              <input type="submit" value="switch"></form>
            </body>
          </html>""" % (urllib.urlencode({'guestbook_name': guestbook_name}),
                        cgi.escape(guestbook_name)))


    class Guestbook(webapp2.RequestHandler):
        def post(self):
            # We set the same parent key on the 'Greeting' to ensure each greeting
            # is in the same entity group. Queries across the single entity group
            # will be consistent. However, the write rate to a single entity group
            # should be limited to ~1/second.
            guestbook_name = self.request.get('guestbook_name')
            greeting = Greeting(parent=guestbook_key(guestbook_name))

            if users.get_current_user():
                greeting.author = users.get_current_user()

            greeting.content = self.request.get('content')
            greeting.put()
            self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))


    application = webapp2.WSGIApplication([
        ('/', MainPage),
        ('/sign', Guestbook)
    ], debug=True)


    def main():
        application.run()


    if __name__ == '__main__':
        main()

Replace ``helloworld/helloworld.py`` with this, then reload
`http://localhost:8080/ <http://localhost:8080/>`_ in your browser.
Post a few messages to verify that messages get stored and displayed correctly.


Storing the Submitted Greetings
-------------------------------
App Engine includes a data modeling API for Python. It's similar to Django's
data modeling API, but uses App Engine's scalable datastore behind the scenes.

For the guestbook application, we want to store greetings posted by users.
Each greeting includes the author's name, the message content, and the date
and time the message was posted so we can display messages in chronological
order.

To use the data modeling API, import the ``google.appengine.ext.db`` module::

    from google.appengine.ext import db

The following defines a data model for a greeting::

    class Greeting(db.Model):
        author = db.UserProperty()
        content = db.StringProperty(multiline=True)
        date = db.DateTimeProperty(auto_now_add=True)

This defines a ``Greeting`` model with three properties: ``author``, whose
value is a ``User`` object; ``content``, whose value is a string; and ``date``,
whose value is a ``datetime.datetime``.

Some property constructors take parameters to further configure their behavior.
Giving the ``db.StringProperty`` constructor the ``multiline=True`` parameter
says that values for this property can contain newline characters. Giving the
``db.DateTimeProperty`` constructor an ``auto_now_add=True`` parameter
configures the model to automatically give new objects a ``date`` of the time
the object is created, if the application doesn't otherwise provide a value.
For a complete list of property types and their options, see `the Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.

Now that we have a data model for greetings, the application can use the model
to create new ``Greeting`` objects and put them into the datastore.
The following
new version of the ``Guestbook`` handler creates new greetings and saves them
to the datastore::

    class Guestbook(webapp2.RequestHandler):
        def post(self):
            guestbook_name = self.request.get('guestbook_name')
            greeting = Greeting(parent=guestbook_key(guestbook_name))

            if users.get_current_user():
                greeting.author = users.get_current_user()

            greeting.content = self.request.get('content')
            greeting.put()
            self.redirect('/?' + urllib.urlencode({'guestbook_name': guestbook_name}))

This new ``Guestbook`` handler creates a new ``Greeting`` object, then sets its
``author`` and ``content`` properties with the data posted by the user.
The parent key has the entity kind "Guestbook". There is no need to create the
"Guestbook" entity before setting it to be the parent of another entity. In
this example, the parent is used as a placeholder for transaction and
consistency purposes. See `Entity Groups and Ancestor Paths <http://code.google.com/appengine/docs/python/datastore/entities.html#Entity_Groups_and_Ancestor_Paths>`_
for more information. Objects that share a common `ancestor <http://code.google.com/appengine/docs/python/datastore/queryclass.html#Query_ancestor>`_
belong to the same entity group. The handler does not set the ``date``
property, so ``date`` is automatically set to "now," as we configured the
model to do.

Finally, ``greeting.put()`` saves our new object to the datastore. If we had
acquired this object from a query, ``put()`` would have updated the existing
object. Since we created this object with the model constructor, ``put()`` adds
the new object to the datastore.

Because querying in the High Replication datastore is only strongly consistent
within entity groups, we assign all Greetings to the same entity group in this
example by setting the same parent for each Greeting. This means a user will
always see a Greeting immediately after it was written.
However, the rate at
which you can write to the same entity group is limited to about one write
per second. When you design a real application you'll need to
keep this fact in mind. Note that by using services such as `Memcache <http://code.google.com/appengine/docs/python/memcache/>`_,
you can reduce the chance that a user won't see fresh results when querying
across entity groups immediately after a write.


Retrieving the Stored Greetings With GQL
----------------------------------------
The App Engine datastore has a sophisticated query engine for data models.
Because the App Engine datastore is not a traditional relational database,
queries are not specified using SQL. Instead, you can prepare queries using a
SQL-like query language we call GQL. GQL provides access to the App Engine
datastore query engine's features using a familiar syntax.

The following new version of the ``MainPage`` handler queries the datastore
for greetings::

    class MainPage(webapp2.RequestHandler):
        def get(self):
            self.response.out.write('<html><body>')
            guestbook_name = self.request.get('guestbook_name')

            greetings = db.GqlQuery("SELECT * "
                                    "FROM Greeting "
                                    "WHERE ANCESTOR IS :1 "
                                    "ORDER BY date DESC LIMIT 10",
                                    guestbook_key(guestbook_name))

            for greeting in greetings:
                if greeting.author:
                    self.response.out.write('<b>%s</b> wrote:' % greeting.author.nickname())
                else:
                    self.response.out.write('An anonymous person wrote:')
                self.response.out.write('<blockquote>%s</blockquote>' %
                                        cgi.escape(greeting.content))

            # Write the submission form and the footer of the page
            self.response.out.write("""
                  <form action="/sign" method="post">
                    <div><textarea name="content" rows="3" cols="60"></textarea></div>
                    <div><input type="submit" value="Sign Guestbook"></div>
                  </form>
                </body>
              </html>""")

The query happens here::


    greetings = db.GqlQuery("SELECT * "
                            "FROM Greeting "
                            "WHERE ANCESTOR IS :1 "
                            "ORDER BY date DESC LIMIT 10",
                            guestbook_key(guestbook_name))

Alternatively, you can call the ``gql(...)`` method on the ``Greeting`` class,
and omit the ``SELECT * FROM Greeting`` from the query::

    greetings = Greeting.gql("WHERE ANCESTOR IS :1 ORDER BY date DESC LIMIT 10",
                             guestbook_key(guestbook_name))

As with SQL, keywords (such as ``SELECT``) are case insensitive. Names,
however, are case sensitive.

Because the query returns full data objects, it does not make sense to select
specific properties from the model. All GQL queries start with
``SELECT * FROM model`` (or are so implied by the model's ``gql(...)`` method)
so as to resemble their SQL equivalents.

A GQL query can have a ``WHERE`` clause that filters the result set by one or
more conditions based on property values. Unlike SQL, GQL queries may not
contain value constants: instead, GQL uses parameter binding for all values
in queries. For example, to get only the greetings posted by the current user::

    if users.get_current_user():
        greetings = Greeting.gql(
            "WHERE ANCESTOR IS :1 AND author = :2 ORDER BY date DESC",
            guestbook_key(guestbook_name), users.get_current_user())

You can also use named parameters instead of positional parameters::

    greetings = Greeting.gql("WHERE ANCESTOR IS :ancestor AND author = :author ORDER BY date DESC",
                             ancestor=guestbook_key(guestbook_name), author=users.get_current_user())

In addition to GQL, the datastore API provides another mechanism for building
query objects using methods.
The query above could also be prepared as follows::

    greetings = Greeting.all()
    greetings.ancestor(guestbook_key(guestbook_name))
    greetings.filter("author =", users.get_current_user())
    greetings.order("-date")

For a complete description of GQL and the query APIs, see the `Datastore reference <http://code.google.com/appengine/docs/python/datastore/>`_.


Clearing the Development Server Datastore
-----------------------------------------
The development web server uses a local version of the datastore for testing
your application, using temporary files. The data persists as long as the
temporary files exist, and the web server does not reset these files unless
you ask it to do so.

If you want the development server to erase its datastore prior to starting up,
use the ``--clear_datastore`` option when starting the server:

.. code-block:: text

   dev_appserver.py --clear_datastore helloworld/


Next...
-------
We now have a working guest book application that authenticates users using
Google accounts, lets them submit messages, and displays messages other users
have left. Because App Engine handles scaling automatically, we will not need
to revisit this code as our application gets popular.

This latest version mixes HTML content with the code for the ``MainPage``
handler. This will make it difficult to change the appearance of the application,
especially as our application gets bigger and more complex. Let's use
templates to manage the appearance, and introduce static files for a CSS
stylesheet.

Continue to :ref:`tutorials.gettingstarted.templates`.