page.title=Specifying App Content for Indexing
trainingnavtop=true
@jd:body
Google's web crawling bot (Googlebot), which crawls and indexes web sites for the Google search engine, can also index content in your Android app. By opting in, you allow Googlebot to crawl the content in your APK through the Google Play Store and index it. To indicate which app content you’d like Google to index, add link elements either to your existing Sitemap file or to the {@code <head>} element of each web page in your site, in the same way as you would for web pages.
The deep links that you share with Google Search must take this URI format:
android-app://<package_name>/<scheme>/<host_path>
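The components that make up the URI format are:
{@code package_name}: The package name of your APK, as listed in the Google Play Developer Console.
{@code scheme}: The URI scheme that matches your intent filter.
{@code host_path}: The host and path that identify the specific content within your app.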
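To make the format concrete, the following Java sketch joins a package name, a scheme, and a host path into a deep link URI. The class {@code DeepLinkUri} and its {@code build} method are hypothetical helpers written for this illustration; they are not part of any Android or Google API.

```java
// Minimal sketch: builds an android-app:// deep link from its three parts.
// DeepLinkUri and build() are hypothetical names, not an Android API.
final class DeepLinkUri {
    private DeepLinkUri() {}

    static String build(String packageName, String scheme, String hostPath) {
        if (packageName.isEmpty() || scheme.isEmpty() || hostPath.isEmpty()) {
            throw new IllegalArgumentException("All three components are required");
        }
        // Strip leading slashes so the parts are joined by exactly one '/'.
        String trimmed = hostPath.replaceFirst("^/+", "");
        return "android-app://" + packageName + "/" + scheme + "/" + trimmed;
    }

    public static void main(String[] args) {
        System.out.println(build("com.example.android", "example", "gizmos"));
        // prints: android-app://com.example.android/example/gizmos
    }
}
```

Here the app URI {@code example://gizmos} maps to the scheme {@code example} and the host path {@code gizmos}.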
The following sections describe how to add a deep link URI to your Sitemap or web pages.
To annotate the deep link for Google Search app indexing in your Sitemap, use the {@code <xhtml:link>} tag and specify the deep link as an alternate URI.
For example, the following XML snippet shows how you might specify a link to your web page by using the {@code <loc>} tag, and a corresponding deep link to your Android app by using the {@code <xhtml:link>} tag.
<?xml version="1.0" encoding="UTF-8" ?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
    <url>
        <loc>example://gizmos</loc>
        <xhtml:link rel="alternate"
                    href="android-app://com.example.android/example/gizmos" />
    </url>
    ...
</urlset>
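If your Sitemap is generated by a build step, entries like the one above could be produced programmatically. The following Java sketch renders a single {@code <url>} entry; {@code SitemapEntry}, {@code render}, and {@code escapeXml} are hypothetical names chosen for this example, and the output is assumed to be placed inside a {@code <urlset>} element that declares the {@code xhtml} namespace.

```java
// Minimal sketch: renders one Sitemap <url> entry with an app deep link.
// SitemapEntry and its methods are hypothetical helper names.
final class SitemapEntry {
    private SitemapEntry() {}

    // Escapes the characters that are significant in XML text and attributes.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // pageUrl goes into <loc>; deepLink goes into the alternate link's href.
    static String render(String pageUrl, String deepLink) {
        return "<url>\n"
             + "  <loc>" + escapeXml(pageUrl) + "</loc>\n"
             + "  <xhtml:link rel=\"alternate\" href=\""
             + escapeXml(deepLink) + "\" />\n"
             + "</url>";
    }

    public static void main(String[] args) {
        System.out.println(render("example://gizmos",
                "android-app://com.example.android/example/gizmos"));
    }
}
```

Escaping matters mainly for URLs that contain query strings, because a raw {@code &} would make the Sitemap invalid XML.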
Instead of specifying the deep links for Google Search app indexing in your Sitemap file, you can annotate the deep links in the HTML markup of your web pages. You can do this in the {@code <head>} section for each web page by adding a {@code <link>} tag and specifying the deep link as an alternate URI.
For example, the following HTML snippet shows how you might specify the corresponding deep link in a web page that has the URL {@code example://gizmos}.
<html>
<head>
    <link rel="alternate"
          href="android-app://com.example.android/example/gizmos" />
    ...
</head>
<body> ... </body>
</html>
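Before publishing, you might want to confirm that a page actually carries the annotation. The following Java sketch scans an HTML string for {@code <link rel="alternate">} tags whose {@code href} starts with {@code android-app://}. It uses regular expressions rather than a real HTML parser, so treat it as a rough check only; {@code AppLinkFinder} is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch: lists android-app:// alternate links found in an HTML string.
// Regex-based, so it will not handle every valid HTML page. Hypothetical helper.
final class AppLinkFinder {
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern REL_ALTERNATE =
            Pattern.compile("\\brel\\s*=\\s*[\"']alternate[\"']", Pattern.CASE_INSENSITIVE);
    private static final Pattern APP_HREF =
            Pattern.compile("\\bhref\\s*=\\s*[\"'](android-app://[^\"']+)[\"']",
                    Pattern.CASE_INSENSITIVE);

    private AppLinkFinder() {}

    static List<String> find(String html) {
        List<String> links = new ArrayList<>();
        Matcher tag = LINK_TAG.matcher(html);
        while (tag.find()) {
            String linkTag = tag.group();
            Matcher href = APP_HREF.matcher(linkTag);
            // Keep only tags that are both rel="alternate" and an app deep link.
            if (REL_ALTERNATE.matcher(linkTag).find() && href.find()) {
                links.add(href.group(1));
            }
        }
        return links;
    }
}
```

Calling {@code AppLinkFinder.find(html)} on the snippet above would return a list containing the single deep link; stylesheet and other {@code <link>} tags are ignored.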
Typically, you control how Googlebot crawls publicly accessible URLs on your site by using a {@code robots.txt} file. When Googlebot indexes your app content, your app might make HTTP requests as part of its normal operations. However, these requests will appear to your servers as originating from Googlebot. Therefore, you must configure your server's {@code robots.txt} file properly to allow these requests.
For example, the following {@code robots.txt} directives show how you might allow access to a specific directory in your web site (for example, {@code /api/}) that your app needs to access, while restricting Googlebot's access to other parts of your site.
User-Agent: Googlebot
Allow: /api/
Disallow: /
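To see why these rules let the app's requests through, it helps to know that Googlebot applies the most specific matching rule, measured by path length, and prefers {@code Allow} when an {@code Allow} and a {@code Disallow} rule match equally well. The following Java sketch models that precedence for plain path prefixes only. It ignores wildcards ({@code *}, {@code $}) and empty rule values, and {@code RobotsCheck} is a hypothetical name, not Googlebot's actual implementation.

```java
import java.util.Arrays;
import java.util.List;

// Minimal sketch of longest-match precedence for robots.txt path rules.
// Handles plain prefixes only (no wildcards). Hypothetical helper.
final class RobotsCheck {
    private RobotsCheck() {}

    // Length of the longest prefix that matches the path, or -1 if none match.
    static int longestPrefix(List<String> prefixes, String path) {
        int best = -1;
        for (String prefix : prefixes) {
            if (prefix.isEmpty()) continue; // empty rule values are ignored here
            if (path.startsWith(prefix) && prefix.length() > best) {
                best = prefix.length();
            }
        }
        return best;
    }

    // A tie, including the case where no rule matches, favors allowing the path.
    static boolean isAllowed(List<String> allow, List<String> disallow, String path) {
        return longestPrefix(allow, path) >= longestPrefix(disallow, path);
    }

    public static void main(String[] args) {
        List<String> allow = Arrays.asList("/api/");
        List<String> disallow = Arrays.asList("/");
        System.out.println(isAllowed(allow, disallow, "/api/gizmos"));  // prints: true
        System.out.println(isAllowed(allow, disallow, "/admin/login")); // prints: false
    }
}
```

With the example rules, {@code /api/gizmos} matches {@code Allow: /api/} (5 characters) more specifically than {@code Disallow: /} (1 character), so it is allowed, while every path outside {@code /api/} matches only the {@code Disallow} rule.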
To learn more about how to modify {@code robots.txt} to control web crawling, see the Controlling Crawling and Indexing Getting Started guide.