# -*- coding: utf-8 -*-
# Copyright 2012 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Additional help about object metadata."""

from __future__ import absolute_import

from gslib.help_provider import HelpProvider

_DETAILED_HELP_TEXT = ("""
<B>OVERVIEW OF METADATA</B>
  Objects can have associated metadata, which control aspects of how
  GET requests are handled, including Content-Type, Cache-Control,
  Content-Disposition, and Content-Encoding (discussed in more detail in
  the subsections below). In addition, you can set custom metadata that
  can be used by applications (e.g., tagging that particular objects possess
  some property).

  There are two ways to set metadata on objects:

  - At upload time you can specify one or more headers to associate with
    objects, using the gsutil -h option. For example, the following command
    would cause gsutil to set the Content-Type and Cache-Control for each
    of the files being uploaded:

      gsutil -h "Content-Type:text/html" \\
             -h "Cache-Control:public, max-age=3600" cp -r images \\
             gs://bucket/images

    Note that -h is an option on the gsutil command, not the cp sub-command.

  - You can set or remove metadata fields from already uploaded objects using
    the gsutil setmeta command. See "gsutil help setmeta".

  More details about specific pieces of metadata are discussed below.


<B>CONTENT TYPE</B>
  The most commonly set metadata is Content-Type (also known as MIME type),
  which allows browsers to render the object properly.
  gsutil sets the Content-Type automatically at upload time, based on each
  filename extension. For example, uploading files with names ending in .txt
  will set Content-Type to text/plain. If you're running gsutil on Linux or
  MacOS and would prefer to have content type set based on naming plus content
  examination, see the use_magicfile configuration variable in the gsutil/boto
  configuration file (see also "gsutil help config"). In general, using
  use_magicfile is more robust and configurable, but is not available on
  Windows.

  If you specify a Content-Type header with -h when uploading content (like the
  example gsutil command given in the previous section), it overrides the
  Content-Type that would have been set based on filename extension or content.
  This can be useful if the Content-Type detection algorithm doesn't work as
  desired for some of your files.

  You can also completely suppress content type detection in gsutil, by
  specifying an empty string on the Content-Type header:

    gsutil -h 'Content-Type:' cp -r images gs://bucket/images

  In this case, the Google Cloud Storage service will attempt to detect
  the content type. In general this approach will work better than using
  filename extension-based content detection in gsutil, because the list of
  filename extensions is kept more current in the server-side content detection
  system than in the Python library upon which gsutil content type detection
  depends. (For example, at the time of writing this, the filename extension
  ".webp" was recognized by the server-side content detection system, but
  not by gsutil.)


<B>CACHE-CONTROL</B>
  Another commonly set piece of metadata is Cache-Control, which allows
  you to control whether and for how long browser and Internet caches are
  allowed to cache your objects. Cache-Control only applies to objects with
  a public-read ACL. Non-public data are not cacheable.

  Here's an example of uploading a set of objects to allow caching:

    gsutil -h "Cache-Control:public,max-age=3600" cp -a public-read \\
           -r html gs://bucket/html

  This command would upload all files in the html directory (and subdirectories)
  and make them publicly readable and cacheable, with cache expiration of
  one hour.

  Note that if you allow caching, at download time you may see older versions
  of objects after uploading a newer replacement object. Note also that because
  objects can be cached at various places on the Internet there is no way to
  force a cached object to expire globally (unlike the way you can force your
  browser to refresh its cache). If you want to prevent caching of publicly
  readable objects you should set a Cache-Control:private header on the object.
  You can do this with a command such as:

    gsutil -h Cache-Control:private cp -a public-read file.png gs://your-bucket

  Another use of the Cache-Control header is through the "no-transform" value,
  which instructs Google Cloud Storage to not apply any content transformations
  based on specifics of a download request, such as removing gzip
  content-encoding for incompatible clients. Note that this parameter is only
  respected by the XML API. The Google Cloud Storage JSON API respects only the
  no-cache and max-age Cache-Control parameters.

  For details about how to set the Cache-Control header see
  "gsutil help setmeta".


<B>CONTENT-ENCODING</B>
  You can specify a Content-Encoding to indicate that an object is compressed
  (for example, with gzip compression) while maintaining its Content-Type.
  You will need to ensure that the files have been compressed using the
  specified Content-Encoding before using gsutil to upload them. Consider the
  following example for Linux:

    echo "Highly compressible text" | gzip > foo.txt
    gsutil -h "Content-Encoding:gzip" -h "Content-Type:text/plain" \\
      cp foo.txt gs://bucket/compressed

  Note that this is different from uploading a gzipped object foo.txt.gz with
  Content-Type: application/x-gzip because most browsers are able to
  dynamically decompress and process objects served with Content-Encoding: gzip
  based on the underlying Content-Type.

  For compressible content, using Content-Encoding: gzip saves network and
  storage costs, and improves content serving performance. However, for content
  that is already inherently compressed (archives and many media formats, for
  instance) applying another level of compression via Content-Encoding is
  typically detrimental to both object size and performance and should be
  avoided.

  Note also that gsutil provides an easy way to cause content to be compressed
  and stored with Content-Encoding: gzip: see the -z option in "gsutil help cp".


<B>CONTENT-DISPOSITION</B>
  You can set Content-Disposition on your objects, to specify presentation
  information about the data being transmitted. Here's an example:

    gsutil -h 'Content-Disposition:attachment; filename=filename.ext' \\
      cp -r attachments gs://bucket/attachments

  Setting the Content-Disposition allows you to control presentation style
  of the content, for example determining whether an attachment should be
  automatically displayed or should require some form of action from the user
  to open it. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec19.html#sec19.5.1
  for more details about the meaning of Content-Disposition.


<B>CUSTOM METADATA</B>
  You can add your own custom metadata (e.g.,
  for use by your application) to an object by setting a header that starts
  with "x-goog-meta", for example:

    gsutil -h x-goog-meta-reviewer:jane cp mycode.java gs://bucket/reviews

  You can add multiple differently named custom metadata fields to each object.


<B>SETTABLE FIELDS; FIELD VALUES</B>
  You can't set some metadata fields, such as ETag and Content-Length. The
  fields you can set are:

  - Cache-Control
  - Content-Disposition
  - Content-Encoding
  - Content-Language
  - Content-MD5
  - Content-Type
  - Any field starting with a matching Cloud Storage Provider
    prefix, such as x-goog-meta- (i.e., custom metadata).

  Header names are case-insensitive.

  x-goog-meta- fields can have data set to arbitrary Unicode values. All
  other fields must have ASCII values.


<B>VIEWING CURRENTLY SET METADATA</B>
  You can see what metadata is currently set on an object by using:

    gsutil ls -L gs://the_bucket/the_object
""")


class CommandOptions(HelpProvider):
  """Additional help about object metadata."""

  # Help specification. See help_provider.py for documentation.
  help_spec = HelpProvider.HelpSpec(
      help_name='metadata',
      help_name_aliases=[
          'cache-control', 'caching', 'content type', 'mime type', 'mime',
          'type'],
      help_type='additional_help',
      help_one_line_summary='Working With Object Metadata',
      help_text=_DETAILED_HELP_TEXT,
      subcommand_help_text={},
  )