1<html><body> 2<style> 3 4body, h1, h2, h3, div, span, p, pre, a { 5 margin: 0; 6 padding: 0; 7 border: 0; 8 font-weight: inherit; 9 font-style: inherit; 10 font-size: 100%; 11 font-family: inherit; 12 vertical-align: baseline; 13} 14 15body { 16 font-size: 13px; 17 padding: 1em; 18} 19 20h1 { 21 font-size: 26px; 22 margin-bottom: 1em; 23} 24 25h2 { 26 font-size: 24px; 27 margin-bottom: 1em; 28} 29 30h3 { 31 font-size: 20px; 32 margin-bottom: 1em; 33 margin-top: 1em; 34} 35 36pre, code { 37 line-height: 1.5; 38 font-family: Monaco, 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Lucida Console', monospace; 39} 40 41pre { 42 margin-top: 0.5em; 43} 44 45h1, h2, h3, p { 46 font-family: Arial, sans serif; 47} 48 49h1, h2, h3 { 50 border-bottom: solid #CCC 1px; 51} 52 53.toc_element { 54 margin-top: 0.5em; 55} 56 57.firstline { 58 margin-left: 2 em; 59} 60 61.method { 62 margin-top: 1em; 63 border: solid 1px #CCC; 64 padding: 1em; 65 background: #EEE; 66} 67 68.details { 69 font-weight: bold; 70 font-size: 14px; 71} 72 73</style> 74 75<h1><a href="dlp_v2.html">Cloud Data Loss Prevention (DLP) API</a> . <a href="dlp_v2.projects.html">projects</a> . <a href="dlp_v2.projects.dlpJobs.html">dlpJobs</a></h1> 76<h2>Instance Methods</h2> 77<p class="toc_element"> 78 <code><a href="#cancel">cancel(name, body=None, x__xgafv=None)</a></code></p> 79<p class="firstline">Starts asynchronous cancellation on a long-running DlpJob. The server</p> 80<p class="toc_element"> 81 <code><a href="#create">create(parent, body, x__xgafv=None)</a></code></p> 82<p class="firstline">Creates a new job to inspect storage or calculate risk metrics.</p> 83<p class="toc_element"> 84 <code><a href="#delete">delete(name, x__xgafv=None)</a></code></p> 85<p class="firstline">Deletes a long-running DlpJob. This method indicates that the client is</p> 86<p class="toc_element"> 87 <code><a href="#get">get(name, x__xgafv=None)</a></code></p> 88<p class="firstline">Gets the latest state of a long-running DlpJob.</p> 89<p class="toc_element"> 90 <code><a href="#list">list(parent, orderBy=None, type=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</a></code></p> 91<p class="firstline">Lists DlpJobs that match the specified filter in the request.</p> 92<p class="toc_element"> 93 <code><a href="#list_next">list_next(previous_request, previous_response)</a></code></p> 94<p class="firstline">Retrieves the next page of results.</p> 95<h3>Method Details</h3> 96<div class="method"> 97 <code class="details" id="cancel">cancel(name, body=None, x__xgafv=None)</code> 98 <pre>Starts asynchronous cancellation on a long-running DlpJob. The server 99makes a best effort to cancel the DlpJob, but success is not 100guaranteed. 101See https://cloud.google.com/dlp/docs/inspecting-storage and 102https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. 103 104Args: 105 name: string, The name of the DlpJob resource to be cancelled. (required) 106 body: object, The request body. 107 The object takes the form of: 108 109{ # The request message for canceling a DLP job. 110 } 111 112 x__xgafv: string, V1 error format. 113 Allowed values 114 1 - v1 error format 115 2 - v2 error format 116 117Returns: 118 An object of the form: 119 120 { # A generic empty message that you can re-use to avoid defining duplicated 121 # empty messages in your APIs. A typical example is to use it as the request 122 # or the response type of an API method. For instance: 123 # 124 # service Foo { 125 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 126 # } 127 # 128 # The JSON representation for `Empty` is empty JSON object `{}`. 129 }</pre> 130</div> 131 132<div class="method"> 133 <code class="details" id="create">create(parent, body, x__xgafv=None)</code> 134 <pre>Creates a new job to inspect storage or calculate risk metrics. 135See https://cloud.google.com/dlp/docs/inspecting-storage and 136https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. 137 138When no InfoTypes or CustomInfoTypes are specified in inspect jobs, the 139system will automatically choose what detectors to run. By default this may 140be all types, but may change over time as detectors are updated. 141 142Args: 143 parent: string, The parent resource name, for example projects/my-project-id. (required) 144 body: object, The request body. (required) 145 The object takes the form of: 146 147{ # Request message for CreateDlpJobRequest. Used to initiate long running 148 # jobs such as calculating risk metrics or inspecting Google Cloud 149 # Storage. 150 "riskJob": { # Configuration for a risk analysis job. See 151 # https://cloud.google.com/dlp/docs/concepts-risk-analysis to learn more. 152 "privacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. 153 "numericalStatsConfig": { # Compute numerical stats over an individual column, including 154 # min, max, and quantiles. 155 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are 156 # integer, float, date, datetime, timestamp, time. 157 "name": "A String", # Name describing the field. 158 }, 159 }, 160 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what 161 # is called "journalist risk" in the literature, except the attack dataset is 162 # statistically modeled instead of being perfectly known. This can be done 163 # using publicly available data (like the US Census), or using a custom 164 # statistical model (indicated as one or several BigQuery tables), or by 165 # extrapolating from the distribution of values in the input dataset. 166 # A column with a semantic tag attached. 167 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 168 # Required if no column is tagged with a region-specific InfoType (like 169 # US_ZIP_5) or a region code. 170 "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the 171 # same tag. [required] 172 { 173 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 174 "name": "A String", # Name describing the field. 175 }, 176 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 177 # indicate an auxiliary table that contains statistical information on 178 # the possible values of this column (below). 179 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 180 # dataset as a statistical model of population, if available. We 181 # currently support US ZIP codes, region codes, ages and genders. 182 # To programmatically obtain the list of supported InfoTypes, use 183 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 184 "name": "A String", # Name of the information type. Either a name of your choosing when 185 # creating a CustomInfoType, or one of the names listed 186 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 187 # a built-in type. InfoType names should conform to the pattern 188 # [a-zA-Z0-9_]{1,64}. 189 }, 190 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 191 # the distribution of values in the input data 192 # empty messages in your APIs. A typical example is to use it as the request 193 # or the response type of an API method. For instance: 194 # 195 # service Foo { 196 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 197 # } 198 # 199 # The JSON representation for `Empty` is empty JSON object `{}`. 200 }, 201 }, 202 ], 203 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 204 # used to tag a quasi-identifiers column must appear in exactly one column 205 # of one auxiliary table. 206 { # An auxiliary table contains statistical information on the relative 207 # frequency of different quasi-identifiers values. It has one or several 208 # quasi-identifiers columns, and one column that indicates the relative 209 # frequency of each quasi-identifier tuple. 210 # If a tuple is present in the data but not in the auxiliary table, the 211 # corresponding relative frequency is assumed to be zero (and thus, the 212 # tuple is highly reidentifiable). 213 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 214 # between 0 and 1 (inclusive). Null values are assumed to be zero. 215 # [required] 216 "name": "A String", # Name describing the field. 217 }, 218 "quasiIds": [ # Quasi-identifier columns. [required] 219 { # A quasi-identifier column has a custom_tag, used to know which column 220 # in the data corresponds to which column in the statistical model. 221 "field": { # General identifier of a data field in a storage service. 222 "name": "A String", # Name describing the field. 223 }, 224 "customTag": "A String", 225 }, 226 ], 227 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 228 # identified by its project_id, dataset_id, and table_name. Within a query 229 # a table is often referenced with a string in the format of: 230 # `<project_id>:<dataset_id>.<table_id>` or 231 # `<project_id>.<dataset_id>.<table_id>`. 232 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 233 # If omitted, project ID is inferred from the API call. 234 "tableId": "A String", # Name of the table. 235 "datasetId": "A String", # Dataset ID of the table. 236 }, 237 }, 238 ], 239 }, 240 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. 241 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. 242 "name": "A String", # Name describing the field. 243 }, 244 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are 245 # defined for the l-diversity computation. When multiple fields are 246 # specified, they are considered a single composite key. 247 { # General identifier of a data field in a storage service. 248 "name": "A String", # Name describing the field. 249 }, 250 ], 251 }, 252 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to 253 # figure out that one given individual appears in a de-identified dataset. 254 # Similarly to the k-map metric, we cannot compute δ-presence exactly without 255 # knowing the attack dataset, so we use a statistical model instead. 256 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 257 # Required if no column is tagged with a region-specific InfoType (like 258 # US_ZIP_5) or a region code. 259 "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the 260 # same tag. [required] 261 { # A column with a semantic tag attached. 262 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 263 "name": "A String", # Name describing the field. 264 }, 265 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 266 # indicate an auxiliary table that contains statistical information on 267 # the possible values of this column (below). 268 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 269 # dataset as a statistical model of population, if available. We 270 # currently support US ZIP codes, region codes, ages and genders. 271 # To programmatically obtain the list of supported InfoTypes, use 272 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 273 "name": "A String", # Name of the information type. Either a name of your choosing when 274 # creating a CustomInfoType, or one of the names listed 275 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 276 # a built-in type. InfoType names should conform to the pattern 277 # [a-zA-Z0-9_]{1,64}. 278 }, 279 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 280 # the distribution of values in the input data 281 # empty messages in your APIs. A typical example is to use it as the request 282 # or the response type of an API method. For instance: 283 # 284 # service Foo { 285 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 286 # } 287 # 288 # The JSON representation for `Empty` is empty JSON object `{}`. 289 }, 290 }, 291 ], 292 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 293 # used to tag a quasi-identifiers field must appear in exactly one 294 # field of one auxiliary table. 295 { # An auxiliary table containing statistical information on the relative 296 # frequency of different quasi-identifiers values. It has one or several 297 # quasi-identifiers columns, and one column that indicates the relative 298 # frequency of each quasi-identifier tuple. 299 # If a tuple is present in the data but not in the auxiliary table, the 300 # corresponding relative frequency is assumed to be zero (and thus, the 301 # tuple is highly reidentifiable). 302 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 303 # between 0 and 1 (inclusive). Null values are assumed to be zero. 304 # [required] 305 "name": "A String", # Name describing the field. 306 }, 307 "quasiIds": [ # Quasi-identifier columns. [required] 308 { # A quasi-identifier column has a custom_tag, used to know which column 309 # in the data corresponds to which column in the statistical model. 310 "field": { # General identifier of a data field in a storage service. 311 "name": "A String", # Name describing the field. 312 }, 313 "customTag": "A String", 314 }, 315 ], 316 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 317 # identified by its project_id, dataset_id, and table_name. Within a query 318 # a table is often referenced with a string in the format of: 319 # `<project_id>:<dataset_id>.<table_id>` or 320 # `<project_id>.<dataset_id>.<table_id>`. 321 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 322 # If omitted, project ID is inferred from the API call. 323 "tableId": "A String", # Name of the table. 324 "datasetId": "A String", # Dataset ID of the table. 325 }, 326 }, 327 ], 328 }, 329 "categoricalStatsConfig": { # Compute numerical stats over an individual column, including 330 # number of distinct values and value count distribution. 331 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are 332 # supported except for arrays and structs. However, it may be more 333 # informative to use NumericalStats when the field type is supported, 334 # depending on the data. 335 "name": "A String", # Name describing the field. 336 }, 337 }, 338 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. 339 "entityId": { # An entity in a dataset is a field or set of fields that correspond to a # Optional message indicating that multiple rows might be associated to a 340 # single individual. If the same entity_id is associated to multiple 341 # quasi-identifier tuples over distinct rows, we consider the entire 342 # collection of tuples as the composite quasi-identifier. This collection 343 # is a multiset: the order in which the different tuples appear in the 344 # dataset is ignored, but their frequency is taken into account. 345 # 346 # Important note: a maximum of 1000 rows can be associated to a single 347 # entity ID. If more rows are associated with the same entity ID, some 348 # might be ignored. 349 # single person. For example, in medical records the `EntityId` might be a 350 # patient identifier, or for financial records it might be an account 351 # identifier. This message is used when generalizations or analysis must take 352 # into account that multiple rows correspond to the same entity. 353 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. 354 "name": "A String", # Name describing the field. 355 }, 356 }, 357 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are 358 # specified, they are considered a single composite key. Structs and 359 # repeated data types are not supported; however, nested fields are 360 # supported so long as they are not structs themselves or nested within 361 # a repeated field. 362 { # General identifier of a data field in a storage service. 363 "name": "A String", # Name describing the field. 364 }, 365 ], 366 }, 367 }, 368 "sourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over. 369 # identified by its project_id, dataset_id, and table_name. Within a query 370 # a table is often referenced with a string in the format of: 371 # `<project_id>:<dataset_id>.<table_id>` or 372 # `<project_id>.<dataset_id>.<table_id>`. 373 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 374 # If omitted, project ID is inferred from the API call. 375 "tableId": "A String", # Name of the table. 376 "datasetId": "A String", # Dataset ID of the table. 377 }, 378 "actions": [ # Actions to execute at the completion of the job. Are executed in the order 379 # provided. 380 { # A task to execute on the completion of a job. 381 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. 382 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. 383 # OutputStorageConfig. Only a single instance of this action can be 384 # specified. 385 # Compatible with: Inspect, Risk 386 "outputConfig": { # Cloud repository for storing output. 387 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing 388 # dataset. If table_id is not set a new one will be generated 389 # for you with the following format: 390 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for 391 # generating the date details. 392 # 393 # For Inspect, each column in an existing output table must have the same 394 # name, type, and mode of a field in the `Finding` object. 395 # 396 # For Risk, an existing output table should be the output of a previous 397 # Risk analysis job run on the same source table, with the same privacy 398 # metric and quasi-identifiers. Risk jobs that analyze the same table but 399 # compute a different privacy metric, or use different sets of 400 # quasi-identifiers, cannot store their results in the same table. 401 # identified by its project_id, dataset_id, and table_name. Within a query 402 # a table is often referenced with a string in the format of: 403 # `<project_id>:<dataset_id>.<table_id>` or 404 # `<project_id>.<dataset_id>.<table_id>`. 405 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 406 # If omitted, project ID is inferred from the API call. 407 "tableId": "A String", # Name of the table. 408 "datasetId": "A String", # Dataset ID of the table. 409 }, 410 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only 411 # used for Inspect and must be unspecified for Risk jobs. Columns are derived 412 # from the `Finding` object. If appending to an existing table, any columns 413 # from the predefined schema that are missing will be added. No columns in 414 # the existing table will be deleted. 415 # 416 # If unspecified, then all available columns will be used for a new table or 417 # an (existing) table with no schema, and no changes will be made to an 418 # existing table that has a schema. 419 }, 420 }, 421 "jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's 422 # completion/failure. 423 # completion/failure. 424 }, 425 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). 426 # Command Center (CSCC Alpha). 427 # This action is only available for projects which are parts of 428 # an organization and whitelisted for the alpha Cloud Security Command 429 # Center. 430 # The action will publish count of finding instances and their info types. 431 # The summary of findings will be persisted in CSCC and are governed by CSCC 432 # service-specific policy, see https://cloud.google.com/terms/service-terms 433 # Only a single instance of this action can be specified. 434 # Compatible with: Inspect 435 }, 436 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic. 437 # message contains a single field, `DlpJobName`, which is equal to the 438 # finished job's 439 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). 440 # Compatible with: Inspect, Risk 441 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given 442 # publishing access rights to the DLP API service account executing 443 # the long running DlpJob sending the notifications. 444 # Format is projects/{project}/topics/{topic}. 445 }, 446 }, 447 ], 448 }, 449 "jobId": "A String", # The job id can contain uppercase and lowercase letters, 450 # numbers, and hyphens; that is, it must match the regular 451 # expression: `[a-zA-Z\\d-_]+`. The maximum length is 100 452 # characters. Can be empty to allow the system to generate one. 453 "inspectJob": { 454 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. 455 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. 456 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always 457 # by project and namespace, however the namespace ID may be empty. 458 # A partition ID identifies a grouping of entities. The grouping is always 459 # by project and namespace, however the namespace ID may be empty. 460 # 461 # A partition ID contains several dimensions: 462 # project ID and namespace ID. 463 "projectId": "A String", # The ID of the project to which the entities belong. 464 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. 465 }, 466 "kind": { # A representation of a Datastore kind. # The kind to process. 467 "name": "A String", # The name of the kind. 468 }, 469 }, 470 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. 471 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip 472 # inspection of entire columns which you know have no findings. 473 { # General identifier of a data field in a storage service. 474 "name": "A String", # Name describing the field. 475 }, 476 ], 477 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the 478 # rest of the rows are omitted. If not set, or if set to 0, all rows will be 479 # scanned. Only one of rows_limit and rows_limit_percent can be specified. 480 # Cannot be used in conjunction with TimespanConfig. 481 "sampleMethod": "A String", 482 "identifyingFields": [ # References to fields uniquely identifying rows within the table. 483 # Nested fields in the format, like `person.birthdate.year`, are allowed. 484 { # General identifier of a data field in a storage service. 485 "name": "A String", # Name describing the field. 486 }, 487 ], 488 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows 489 # scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and 490 # 100 means no limit. Defaults to 0. Only one of rows_limit and 491 # rows_limit_percent can be specified. Cannot be used in conjunction with 492 # TimespanConfig. 493 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference. 494 # identified by its project_id, dataset_id, and table_name. Within a query 495 # a table is often referenced with a string in the format of: 496 # `<project_id>:<dataset_id>.<table_id>` or 497 # `<project_id>.<dataset_id>.<table_id>`. 498 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 499 # If omitted, project ID is inferred from the API call. 500 "tableId": "A String", # Name of the table. 501 "datasetId": "A String", # Dataset ID of the table. 502 }, 503 }, 504 "timespanConfig": { # Configuration of the timespan of the items to include in scanning. 505 # Currently only supported when inspecting Google Cloud Storage and BigQuery. 506 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. 507 # Used for data sources like Datastore or BigQuery. 508 # If not specified for BigQuery, table last modification timestamp 509 # is checked against given time span. 510 # The valid data types of the timestamp field are: 511 # for BigQuery - timestamp, date, datetime; 512 # for Datastore - timestamp. 513 # Datastore entity will be scanned if the timestamp property does not exist 514 # or its value is empty or invalid. 515 "name": "A String", # Name describing the field. 516 }, 517 "endTime": "A String", # Exclude files or rows newer than this value. 518 # If set to zero, no upper time limit is applied. 519 "startTime": "A String", # Exclude files or rows older than this value. 520 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out 521 # a valid start_time to avoid scanning files that have not been modified 522 # since the last time the JobTrigger executed. This will be based on the 523 # time of the execution of the last run of the JobTrigger. 524 }, 525 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification. 526 # bucket. 527 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger 528 # than this value then the rest of the bytes are omitted. Only one 529 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 530 "sampleMethod": "A String", 531 "fileSet": { # Set of files to scan. # The set of one or more files to scan. 532 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format 533 # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. 534 # 535 # If the url ends in a trailing slash, the bucket or directory represented 536 # by the url will be scanned non-recursively (content in sub-directories 537 # will not be scanned). This means that `gs://mybucket/` is equivalent to 538 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to 539 # `gs://mybucket/directory/*`. 540 # 541 # Exactly one of `url` or `regex_file_set` must be set. 542 "regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or 543 # `regex_file_set` must be set. 544 # expressions are used to allow fine-grained control over which files in the 545 # bucket to include. 546 # 547 # Included files are those that match at least one item in `include_regex` and 548 # do not match any items in `exclude_regex`. Note that a file that matches 549 # items from both lists will _not_ be included. For a match to occur, the 550 # entire file path (i.e., everything in the url after the bucket name) must 551 # match the regular expression. 552 # 553 # For example, given the input `{bucket_name: "mybucket", include_regex: 554 # ["directory1/.*"], exclude_regex: 555 # ["directory1/excluded.*"]}`: 556 # 557 # * `gs://mybucket/directory1/myfile` will be included 558 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches 559 # across `/`) 560 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the 561 # full path doesn't match any items in `include_regex`) 562 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path 563 # matches an item in `exclude_regex`) 564 # 565 # If `include_regex` is left empty, it will match all files by default 566 # (this is equivalent to setting `include_regex: [".*"]`). 567 # 568 # Some other common use cases: 569 # 570 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all 571 # files in `mybucket` except for .pdf files 572 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will 573 # include all files directly under `gs://mybucket/directory/`, without matching 574 # across `/` 575 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in 576 # the bucket that match at least one of these regular expressions will be 577 # excluded from the scan. 578 # 579 # Regular expressions use RE2 580 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 581 # under the google/re2 repository on GitHub. 582 "A String", 583 ], 584 "bucketName": "A String", # The name of a Cloud Storage bucket. Required. 585 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in 586 # the bucket that match at least one of these regular expressions will be 587 # included in the set of files, except for those that also match an item in 588 # `exclude_regex`. Leaving this field empty will match all files by default 589 # (this is equivalent to including `.*` in the list). 590 # 591 # Regular expressions use RE2 592 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 593 # under the google/re2 repository on GitHub. 594 "A String", 595 ], 596 }, 597 }, 598 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The 599 # number of bytes scanned is rounded down. Must be between 0 and 100, 600 # inclusively. Both 0 and 100 means no limit. Defaults to 0. Only one 601 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 602 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. 603 # Number of files scanned is rounded down. Must be between 0 and 100, 604 # inclusively. Both 0 and 100 means no limit. Defaults to 0. 605 "fileTypes": [ # List of file type groups to include in the scan. 606 # If empty, all files are scanned and available data format processors 607 # are applied. In addition, the binary content of the selected files 608 # is always scanned as well. 609 "A String", 610 ], 611 }, 612 }, 613 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. 614 # When used with redactContent only info_types and min_likelihood are currently 615 # used. 616 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 617 "limits": { 618 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 619 # When set within `InspectContentRequest`, the maximum returned is 2000 620 # regardless if this is set higher. 621 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 622 { # Max findings configuration per infoType, per content item or long 623 # running DlpJob. 624 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 625 # info_type should be provided. If InfoTypeLimit does not have an 626 # info_type, the DLP API applies the limit against all info_types that 627 # are found but not specified in another InfoTypeLimit. 628 "name": "A String", # Name of the information type. Either a name of your choosing when 629 # creating a CustomInfoType, or one of the names listed 630 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 631 # a built-in type. InfoType names should conform to the pattern 632 # [a-zA-Z0-9_]{1,64}. 633 }, 634 "maxFindings": 42, # Max findings limit for the given infoType. 635 }, 636 ], 637 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 638 # When set within `InspectDataSourceRequest`, 639 # the maximum returned is 2000 regardless if this is set higher. 640 # When set within `InspectContentRequest`, this field is ignored. 641 }, 642 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 643 # POSSIBLE. 644 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 645 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 646 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 647 { # Custom information type provided by the user. Used to find domain-specific 648 # sensitive information configurable to the data in question. 649 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 650 "pattern": "A String", # Pattern defining the regular expression. Its syntax 651 # (https://github.com/google/re2/wiki/Syntax) can be found under the 652 # google/re2 repository on GitHub. 653 "groupIndexes": [ # The index of the submatch to extract as findings. When not 654 # specified, the entire match is returned. No more than 3 may be included. 655 42, 656 ], 657 }, 658 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 659 # support reversing. 660 # such as 661 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 662 # These types of transformations are 663 # those that perform pseudonymization, thereby producing a "surrogate" as 664 # output. This should be used in conjunction with a field on the 665 # transformation such as `surrogate_info_type`. This CustomInfoType does 666 # not support the use of `detection_rules`. 667 }, 668 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 669 # infoType, when the name matches one of existing infoTypes and that infoType 670 # is specified in `InspectContent.info_types` field. Specifying the latter 671 # adds findings to the one detected by the system. If built-in info type is 672 # not specified in `InspectContent.info_types` list then the name is treated 673 # as a custom info type. 674 "name": "A String", # Name of the information type. Either a name of your choosing when 675 # creating a CustomInfoType, or one of the names listed 676 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 677 # a built-in type. InfoType names should conform to the pattern 678 # [a-zA-Z0-9_]{1,64}. 679 }, 680 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 681 # be used to match sensitive information specific to the data, such as a list 682 # of employee IDs or job titles. 683 # 684 # Dictionary words are case-insensitive and all characters other than letters 685 # and digits in the unicode [Basic Multilingual 686 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 687 # will be replaced with whitespace when scanning for matches, so the 688 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 689 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 690 # surrounding any match must be of a different type than the adjacent 691 # characters within the word, so letters must be next to non-letters and 692 # digits next to non-digits. For example, the dictionary word "jen" will 693 # match the first three letters of the text "jen123" but will return no 694 # matches for "jennifer". 695 # 696 # Dictionary words containing a large number of characters that are not 697 # letters or digits may result in unexpected findings because such characters 698 # are treated as whitespace. The 699 # [limits](https://cloud.google.com/dlp/limits) page contains details about 700 # the size limits of dictionaries. For dictionaries that do not fit within 701 # these constraints, consider using `LargeCustomDictionaryConfig` in the 702 # `StoredInfoType` API. 703 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 704 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 705 # at least one phrase and every phrase must contain at least 2 characters 706 # that are letters or digits. [required] 707 "A String", 708 ], 709 }, 710 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 711 # is accepted. 712 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 713 # Example: gs://[BUCKET_NAME]/dictionary.txt 714 }, 715 }, 716 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 717 # `InspectDataSource`. Not currently supported in `InspectContent`. 718 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 719 # `organizations/433245324/storedInfoTypes/432452342` or 720 # `projects/project-id/storedInfoTypes/432452342`. 721 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 722 # inspection was created. Output-only field, populated by the system. 723 }, 724 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 725 # Rules are applied in order that they are specified. Not supported for the 726 # `surrogate_type` CustomInfoType. 727 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 728 # `CustomInfoType` to alter behavior under certain circumstances, depending 729 # on the specific details of the rule. Not supported for the `surrogate_type` 730 # custom infoType. 731 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 732 # proximity of hotwords. 733 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 734 # The total length of the window cannot exceed 1000 characters. Note that 735 # the finding itself will be included in the window, so that hotwords may 736 # be used to match substrings of the finding itself. For example, the 737 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 738 # adjusted upwards if the area code is known to be the local area code of 739 # a company office using the hotword regex "\(xxx\)", where "xxx" 740 # is the area code in question. 741 # rule. 742 "windowAfter": 42, # Number of characters after the finding to consider. 743 "windowBefore": 42, # Number of characters before the finding to consider. 744 }, 745 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 746 "pattern": "A String", # Pattern defining the regular expression. Its syntax 747 # (https://github.com/google/re2/wiki/Syntax) can be found under the 748 # google/re2 repository on GitHub. 749 "groupIndexes": [ # The index of the submatch to extract as findings. When not 750 # specified, the entire match is returned. No more than 3 may be included. 751 42, 752 ], 753 }, 754 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 755 # part of a detection rule. 756 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 757 # levels. For example, if a finding would be `POSSIBLE` without the 758 # detection rule and `relative_likelihood` is 1, then it is upgraded to 759 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 760 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 761 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 762 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 763 # a final likelihood of `LIKELY`. 764 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 765 }, 766 }, 767 }, 768 ], 769 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 770 # to be returned. It still can be used for rules matching. 771 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 772 # altered by a detection rule if the finding meets the criteria specified by 773 # the rule. Defaults to `VERY_LIKELY` if not specified. 774 }, 775 ], 776 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 777 # included in the response; see Finding.quote. 778 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 779 # Exclusion rules, contained in the set are executed in the end, other 780 # rules are executed in the order they are specified for each info type. 781 { # Rule set for modifying a set of infoTypes to alter behavior under certain 782 # circumstances, depending on the specific details of the rules within the set. 783 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 784 { # A single inspection rule to be applied to infoTypes, specified in 785 # `InspectionRuleSet`. 786 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 787 # proximity of hotwords. 788 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 789 # The total length of the window cannot exceed 1000 characters. Note that 790 # the finding itself will be included in the window, so that hotwords may 791 # be used to match substrings of the finding itself. For example, the 792 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 793 # adjusted upwards if the area code is known to be the local area code of 794 # a company office using the hotword regex "\(xxx\)", where "xxx" 795 # is the area code in question. 796 # rule. 797 "windowAfter": 42, # Number of characters after the finding to consider. 798 "windowBefore": 42, # Number of characters before the finding to consider. 799 }, 800 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 801 "pattern": "A String", # Pattern defining the regular expression. Its syntax 802 # (https://github.com/google/re2/wiki/Syntax) can be found under the 803 # google/re2 repository on GitHub. 804 "groupIndexes": [ # The index of the submatch to extract as findings. When not 805 # specified, the entire match is returned. No more than 3 may be included. 806 42, 807 ], 808 }, 809 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 810 # part of a detection rule. 811 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 812 # levels. For example, if a finding would be `POSSIBLE` without the 813 # detection rule and `relative_likelihood` is 1, then it is upgraded to 814 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 815 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 816 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 817 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 818 # a final likelihood of `LIKELY`. 819 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 820 }, 821 }, 822 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 823 # `InspectionRuleSet` are removed from results. 824 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 825 "pattern": "A String", # Pattern defining the regular expression. Its syntax 826 # (https://github.com/google/re2/wiki/Syntax) can be found under the 827 # google/re2 repository on GitHub. 828 "groupIndexes": [ # The index of the submatch to extract as findings. When not 829 # specified, the entire match is returned. No more than 3 may be included. 830 42, 831 ], 832 }, 833 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 834 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 835 # contained within with a finding of an infoType from this list. For 836 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 837 # `exclusion_rule` containing `exclude_info_types.info_types` with 838 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 839 # with EMAIL_ADDRESS finding. 840 # That leads to "555-222-2222@example.org" to generate only a single 841 # finding, namely email address. 842 { # Type of information detected by the API. 843 "name": "A String", # Name of the information type. Either a name of your choosing when 844 # creating a CustomInfoType, or one of the names listed 845 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 846 # a built-in type. InfoType names should conform to the pattern 847 # [a-zA-Z0-9_]{1,64}. 848 }, 849 ], 850 }, 851 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 852 # be used to match sensitive information specific to the data, such as a list 853 # of employee IDs or job titles. 854 # 855 # Dictionary words are case-insensitive and all characters other than letters 856 # and digits in the unicode [Basic Multilingual 857 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 858 # will be replaced with whitespace when scanning for matches, so the 859 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 860 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 861 # surrounding any match must be of a different type than the adjacent 862 # characters within the word, so letters must be next to non-letters and 863 # digits next to non-digits. For example, the dictionary word "jen" will 864 # match the first three letters of the text "jen123" but will return no 865 # matches for "jennifer". 866 # 867 # Dictionary words containing a large number of characters that are not 868 # letters or digits may result in unexpected findings because such characters 869 # are treated as whitespace. The 870 # [limits](https://cloud.google.com/dlp/limits) page contains details about 871 # the size limits of dictionaries. For dictionaries that do not fit within 872 # these constraints, consider using `LargeCustomDictionaryConfig` in the 873 # `StoredInfoType` API. 874 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 875 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 876 # at least one phrase and every phrase must contain at least 2 characters 877 # that are letters or digits. [required] 878 "A String", 879 ], 880 }, 881 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 882 # is accepted. 883 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 884 # Example: gs://[BUCKET_NAME]/dictionary.txt 885 }, 886 }, 887 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 888 }, 889 }, 890 ], 891 "infoTypes": [ # List of infoTypes this rule set is applied to. 892 { # Type of information detected by the API. 893 "name": "A String", # Name of the information type. Either a name of your choosing when 894 # creating a CustomInfoType, or one of the names listed 895 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 896 # a built-in type. InfoType names should conform to the pattern 897 # [a-zA-Z0-9_]{1,64}. 898 }, 899 ], 900 }, 901 ], 902 "contentOptions": [ # List of options defining data content to scan. 903 # If empty, text, images, and other content will be included. 904 "A String", 905 ], 906 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 907 # InfoType values returned by ListInfoTypes or listed at 908 # https://cloud.google.com/dlp/docs/infotypes-reference. 909 # 910 # When no InfoTypes or CustomInfoTypes are specified in a request, the 911 # system may automatically choose what detectors to run. By default this may 912 # be all types, but may change over time as detectors are updated. 913 # 914 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 915 # but may change over time as new InfoTypes are added. If you need precise 916 # control and predictability as to what detectors are run you should specify 917 # specific InfoTypes listed in the reference. 918 { # Type of information detected by the API. 919 "name": "A String", # Name of the information type. Either a name of your choosing when 920 # creating a CustomInfoType, or one of the names listed 921 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 922 # a built-in type. InfoType names should conform to the pattern 923 # [a-zA-Z0-9_]{1,64}. 924 }, 925 ], 926 }, 927 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. 928 # `inspect_config` will be merged into the values persisted as part of the 929 # template. 930 "actions": [ # Actions to execute at the completion of the job. 931 { # A task to execute on the completion of a job. 932 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. 933 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. 934 # OutputStorageConfig. Only a single instance of this action can be 935 # specified. 936 # Compatible with: Inspect, Risk 937 "outputConfig": { # Cloud repository for storing output. 938 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing 939 # dataset. If table_id is not set a new one will be generated 940 # for you with the following format: 941 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for 942 # generating the date details. 943 # 944 # For Inspect, each column in an existing output table must have the same 945 # name, type, and mode of a field in the `Finding` object. 946 # 947 # For Risk, an existing output table should be the output of a previous 948 # Risk analysis job run on the same source table, with the same privacy 949 # metric and quasi-identifiers. Risk jobs that analyze the same table but 950 # compute a different privacy metric, or use different sets of 951 # quasi-identifiers, cannot store their results in the same table. 952 # identified by its project_id, dataset_id, and table_name. Within a query 953 # a table is often referenced with a string in the format of: 954 # `<project_id>:<dataset_id>.<table_id>` or 955 # `<project_id>.<dataset_id>.<table_id>`. 956 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 957 # If omitted, project ID is inferred from the API call. 958 "tableId": "A String", # Name of the table. 959 "datasetId": "A String", # Dataset ID of the table. 960 }, 961 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only 962 # used for Inspect and must be unspecified for Risk jobs. Columns are derived 963 # from the `Finding` object. If appending to an existing table, any columns 964 # from the predefined schema that are missing will be added. No columns in 965 # the existing table will be deleted. 966 # 967 # If unspecified, then all available columns will be used for a new table or 968 # an (existing) table with no schema, and no changes will be made to an 969 # existing table that has a schema. 970 }, 971 }, 972 "jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's 973 # completion/failure. 974 # completion/failure. 975 }, 976 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). 977 # Command Center (CSCC Alpha). 978 # This action is only available for projects which are parts of 979 # an organization and whitelisted for the alpha Cloud Security Command 980 # Center. 981 # The action will publish count of finding instances and their info types. 982 # The summary of findings will be persisted in CSCC and are governed by CSCC 983 # service-specific policy, see https://cloud.google.com/terms/service-terms 984 # Only a single instance of this action can be specified. 985 # Compatible with: Inspect 986 }, 987 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic. 988 # message contains a single field, `DlpJobName`, which is equal to the 989 # finished job's 990 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). 991 # Compatible with: Inspect, Risk 992 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given 993 # publishing access rights to the DLP API service account executing 994 # the long running DlpJob sending the notifications. 995 # Format is projects/{project}/topics/{topic}. 996 }, 997 }, 998 ], 999 }, 1000 } 1001 1002 x__xgafv: string, V1 error format. 1003 Allowed values 1004 1 - v1 error format 1005 2 - v2 error format 1006 1007Returns: 1008 An object of the form: 1009 1010 { # Combines all of the information about a DLP job. 1011 "errors": [ # A stream of errors encountered running the job. 1012 { # Details information about an error encountered during job execution or 1013 # the results of an unsuccessful activation of the JobTrigger. 1014 # Output only field. 1015 "timestamps": [ # The times the error occurred. 1016 "A String", 1017 ], 1018 "details": { # The `Status` type defines a logical error model that is suitable for 1019 # different programming environments, including REST APIs and RPC APIs. It is 1020 # used by [gRPC](https://github.com/grpc). Each `Status` message contains 1021 # three pieces of data: error code, error message, and error details. 1022 # 1023 # You can find out more about this error model and how to work with it in the 1024 # [API Design Guide](https://cloud.google.com/apis/design/errors). 1025 "message": "A String", # A developer-facing error message, which should be in English. Any 1026 # user-facing error message should be localized and sent in the 1027 # google.rpc.Status.details field, or localized by the client. 1028 "code": 42, # The status code, which should be an enum value of google.rpc.Code. 1029 "details": [ # A list of messages that carry the error details. There is a common set of 1030 # message types for APIs to use. 1031 { 1032 "a_key": "", # Properties of the object. Contains field @type with type URL. 1033 }, 1034 ], 1035 }, 1036 }, 1037 ], 1038 "name": "A String", # The server-assigned name. 1039 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. 1040 "requestedOptions": { # The configuration used for this job. 1041 "snapshotInspectTemplate": { # The inspectTemplate contains a configuration (set of types of sensitive data # If run with an InspectTemplate, a snapshot of its state at the time of 1042 # this run. 1043 # to be detected) to be used anywhere you otherwise would normally specify 1044 # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates 1045 # to learn more. 1046 "updateTime": "A String", # The last update timestamp of a inspectTemplate, output only field. 1047 "displayName": "A String", # Display name (max 256 chars). 1048 "description": "A String", # Short description (max 256 chars). 1049 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. 1050 # When used with redactContent only info_types and min_likelihood are currently 1051 # used. 1052 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 1053 "limits": { 1054 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 1055 # When set within `InspectContentRequest`, the maximum returned is 2000 1056 # regardless if this is set higher. 1057 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 1058 { # Max findings configuration per infoType, per content item or long 1059 # running DlpJob. 1060 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 1061 # info_type should be provided. If InfoTypeLimit does not have an 1062 # info_type, the DLP API applies the limit against all info_types that 1063 # are found but not specified in another InfoTypeLimit. 1064 "name": "A String", # Name of the information type. Either a name of your choosing when 1065 # creating a CustomInfoType, or one of the names listed 1066 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1067 # a built-in type. InfoType names should conform to the pattern 1068 # [a-zA-Z0-9_]{1,64}. 1069 }, 1070 "maxFindings": 42, # Max findings limit for the given infoType. 1071 }, 1072 ], 1073 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 1074 # When set within `InspectDataSourceRequest`, 1075 # the maximum returned is 2000 regardless if this is set higher. 1076 # When set within `InspectContentRequest`, this field is ignored. 1077 }, 1078 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 1079 # POSSIBLE. 1080 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 1081 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 1082 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 1083 { # Custom information type provided by the user. Used to find domain-specific 1084 # sensitive information configurable to the data in question. 1085 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 1086 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1087 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1088 # google/re2 repository on GitHub. 1089 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1090 # specified, the entire match is returned. No more than 3 may be included. 1091 42, 1092 ], 1093 }, 1094 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 1095 # support reversing. 1096 # such as 1097 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 1098 # These types of transformations are 1099 # those that perform pseudonymization, thereby producing a "surrogate" as 1100 # output. This should be used in conjunction with a field on the 1101 # transformation such as `surrogate_info_type`. This CustomInfoType does 1102 # not support the use of `detection_rules`. 1103 }, 1104 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 1105 # infoType, when the name matches one of existing infoTypes and that infoType 1106 # is specified in `InspectContent.info_types` field. Specifying the latter 1107 # adds findings to the one detected by the system. If built-in info type is 1108 # not specified in `InspectContent.info_types` list then the name is treated 1109 # as a custom info type. 1110 "name": "A String", # Name of the information type. Either a name of your choosing when 1111 # creating a CustomInfoType, or one of the names listed 1112 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1113 # a built-in type. InfoType names should conform to the pattern 1114 # [a-zA-Z0-9_]{1,64}. 1115 }, 1116 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 1117 # be used to match sensitive information specific to the data, such as a list 1118 # of employee IDs or job titles. 1119 # 1120 # Dictionary words are case-insensitive and all characters other than letters 1121 # and digits in the unicode [Basic Multilingual 1122 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 1123 # will be replaced with whitespace when scanning for matches, so the 1124 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 1125 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 1126 # surrounding any match must be of a different type than the adjacent 1127 # characters within the word, so letters must be next to non-letters and 1128 # digits next to non-digits. For example, the dictionary word "jen" will 1129 # match the first three letters of the text "jen123" but will return no 1130 # matches for "jennifer". 1131 # 1132 # Dictionary words containing a large number of characters that are not 1133 # letters or digits may result in unexpected findings because such characters 1134 # are treated as whitespace. The 1135 # [limits](https://cloud.google.com/dlp/limits) page contains details about 1136 # the size limits of dictionaries. For dictionaries that do not fit within 1137 # these constraints, consider using `LargeCustomDictionaryConfig` in the 1138 # `StoredInfoType` API. 1139 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 1140 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 1141 # at least one phrase and every phrase must contain at least 2 characters 1142 # that are letters or digits. [required] 1143 "A String", 1144 ], 1145 }, 1146 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 1147 # is accepted. 1148 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 1149 # Example: gs://[BUCKET_NAME]/dictionary.txt 1150 }, 1151 }, 1152 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 1153 # `InspectDataSource`. Not currently supported in `InspectContent`. 1154 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 1155 # `organizations/433245324/storedInfoTypes/432452342` or 1156 # `projects/project-id/storedInfoTypes/432452342`. 1157 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 1158 # inspection was created. Output-only field, populated by the system. 1159 }, 1160 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 1161 # Rules are applied in order that they are specified. Not supported for the 1162 # `surrogate_type` CustomInfoType. 1163 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 1164 # `CustomInfoType` to alter behavior under certain circumstances, depending 1165 # on the specific details of the rule. Not supported for the `surrogate_type` 1166 # custom infoType. 1167 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 1168 # proximity of hotwords. 1169 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 1170 # The total length of the window cannot exceed 1000 characters. Note that 1171 # the finding itself will be included in the window, so that hotwords may 1172 # be used to match substrings of the finding itself. For example, the 1173 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 1174 # adjusted upwards if the area code is known to be the local area code of 1175 # a company office using the hotword regex "\(xxx\)", where "xxx" 1176 # is the area code in question. 1177 # rule. 1178 "windowAfter": 42, # Number of characters after the finding to consider. 1179 "windowBefore": 42, # Number of characters before the finding to consider. 1180 }, 1181 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 1182 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1183 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1184 # google/re2 repository on GitHub. 1185 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1186 # specified, the entire match is returned. No more than 3 may be included. 1187 42, 1188 ], 1189 }, 1190 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 1191 # part of a detection rule. 1192 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 1193 # levels. For example, if a finding would be `POSSIBLE` without the 1194 # detection rule and `relative_likelihood` is 1, then it is upgraded to 1195 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 1196 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 1197 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 1198 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 1199 # a final likelihood of `LIKELY`. 1200 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 1201 }, 1202 }, 1203 }, 1204 ], 1205 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 1206 # to be returned. It still can be used for rules matching. 1207 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 1208 # altered by a detection rule if the finding meets the criteria specified by 1209 # the rule. Defaults to `VERY_LIKELY` if not specified. 1210 }, 1211 ], 1212 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 1213 # included in the response; see Finding.quote. 1214 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 1215 # Exclusion rules, contained in the set are executed in the end, other 1216 # rules are executed in the order they are specified for each info type. 1217 { # Rule set for modifying a set of infoTypes to alter behavior under certain 1218 # circumstances, depending on the specific details of the rules within the set. 1219 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 1220 { # A single inspection rule to be applied to infoTypes, specified in 1221 # `InspectionRuleSet`. 1222 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 1223 # proximity of hotwords. 1224 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 1225 # The total length of the window cannot exceed 1000 characters. Note that 1226 # the finding itself will be included in the window, so that hotwords may 1227 # be used to match substrings of the finding itself. For example, the 1228 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 1229 # adjusted upwards if the area code is known to be the local area code of 1230 # a company office using the hotword regex "\(xxx\)", where "xxx" 1231 # is the area code in question. 1232 # rule. 1233 "windowAfter": 42, # Number of characters after the finding to consider. 1234 "windowBefore": 42, # Number of characters before the finding to consider. 1235 }, 1236 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 1237 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1238 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1239 # google/re2 repository on GitHub. 1240 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1241 # specified, the entire match is returned. No more than 3 may be included. 1242 42, 1243 ], 1244 }, 1245 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 1246 # part of a detection rule. 1247 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 1248 # levels. For example, if a finding would be `POSSIBLE` without the 1249 # detection rule and `relative_likelihood` is 1, then it is upgraded to 1250 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 1251 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 1252 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 1253 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 1254 # a final likelihood of `LIKELY`. 1255 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 1256 }, 1257 }, 1258 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 1259 # `InspectionRuleSet` are removed from results. 1260 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 1261 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1262 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1263 # google/re2 repository on GitHub. 1264 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1265 # specified, the entire match is returned. No more than 3 may be included. 1266 42, 1267 ], 1268 }, 1269 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 1270 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 1271 # contained within with a finding of an infoType from this list. For 1272 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 1273 # `exclusion_rule` containing `exclude_info_types.info_types` with 1274 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 1275 # with EMAIL_ADDRESS finding. 1276 # That leads to "555-222-2222@example.org" to generate only a single 1277 # finding, namely email address. 1278 { # Type of information detected by the API. 1279 "name": "A String", # Name of the information type. Either a name of your choosing when 1280 # creating a CustomInfoType, or one of the names listed 1281 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1282 # a built-in type. InfoType names should conform to the pattern 1283 # [a-zA-Z0-9_]{1,64}. 1284 }, 1285 ], 1286 }, 1287 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 1288 # be used to match sensitive information specific to the data, such as a list 1289 # of employee IDs or job titles. 1290 # 1291 # Dictionary words are case-insensitive and all characters other than letters 1292 # and digits in the unicode [Basic Multilingual 1293 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 1294 # will be replaced with whitespace when scanning for matches, so the 1295 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 1296 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 1297 # surrounding any match must be of a different type than the adjacent 1298 # characters within the word, so letters must be next to non-letters and 1299 # digits next to non-digits. For example, the dictionary word "jen" will 1300 # match the first three letters of the text "jen123" but will return no 1301 # matches for "jennifer". 1302 # 1303 # Dictionary words containing a large number of characters that are not 1304 # letters or digits may result in unexpected findings because such characters 1305 # are treated as whitespace. The 1306 # [limits](https://cloud.google.com/dlp/limits) page contains details about 1307 # the size limits of dictionaries. For dictionaries that do not fit within 1308 # these constraints, consider using `LargeCustomDictionaryConfig` in the 1309 # `StoredInfoType` API. 1310 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 1311 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 1312 # at least one phrase and every phrase must contain at least 2 characters 1313 # that are letters or digits. [required] 1314 "A String", 1315 ], 1316 }, 1317 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 1318 # is accepted. 1319 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 1320 # Example: gs://[BUCKET_NAME]/dictionary.txt 1321 }, 1322 }, 1323 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 1324 }, 1325 }, 1326 ], 1327 "infoTypes": [ # List of infoTypes this rule set is applied to. 1328 { # Type of information detected by the API. 1329 "name": "A String", # Name of the information type. Either a name of your choosing when 1330 # creating a CustomInfoType, or one of the names listed 1331 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1332 # a built-in type. InfoType names should conform to the pattern 1333 # [a-zA-Z0-9_]{1,64}. 1334 }, 1335 ], 1336 }, 1337 ], 1338 "contentOptions": [ # List of options defining data content to scan. 1339 # If empty, text, images, and other content will be included. 1340 "A String", 1341 ], 1342 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 1343 # InfoType values returned by ListInfoTypes or listed at 1344 # https://cloud.google.com/dlp/docs/infotypes-reference. 1345 # 1346 # When no InfoTypes or CustomInfoTypes are specified in a request, the 1347 # system may automatically choose what detectors to run. By default this may 1348 # be all types, but may change over time as detectors are updated. 1349 # 1350 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 1351 # but may change over time as new InfoTypes are added. If you need precise 1352 # control and predictability as to what detectors are run you should specify 1353 # specific InfoTypes listed in the reference. 1354 { # Type of information detected by the API. 1355 "name": "A String", # Name of the information type. Either a name of your choosing when 1356 # creating a CustomInfoType, or one of the names listed 1357 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1358 # a built-in type. InfoType names should conform to the pattern 1359 # [a-zA-Z0-9_]{1,64}. 1360 }, 1361 ], 1362 }, 1363 "createTime": "A String", # The creation timestamp of a inspectTemplate, output only field. 1364 "name": "A String", # The template name. Output only. 1365 # 1366 # The template will have one of the following formats: 1367 # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR 1368 # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` 1369 }, 1370 "jobConfig": { 1371 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. 1372 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. 1373 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always 1374 # by project and namespace, however the namespace ID may be empty. 1375 # A partition ID identifies a grouping of entities. The grouping is always 1376 # by project and namespace, however the namespace ID may be empty. 1377 # 1378 # A partition ID contains several dimensions: 1379 # project ID and namespace ID. 1380 "projectId": "A String", # The ID of the project to which the entities belong. 1381 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. 1382 }, 1383 "kind": { # A representation of a Datastore kind. # The kind to process. 1384 "name": "A String", # The name of the kind. 1385 }, 1386 }, 1387 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. 1388 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip 1389 # inspection of entire columns which you know have no findings. 1390 { # General identifier of a data field in a storage service. 1391 "name": "A String", # Name describing the field. 1392 }, 1393 ], 1394 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the 1395 # rest of the rows are omitted. If not set, or if set to 0, all rows will be 1396 # scanned. Only one of rows_limit and rows_limit_percent can be specified. 1397 # Cannot be used in conjunction with TimespanConfig. 1398 "sampleMethod": "A String", 1399 "identifyingFields": [ # References to fields uniquely identifying rows within the table. 1400 # Nested fields in the format, like `person.birthdate.year`, are allowed. 1401 { # General identifier of a data field in a storage service. 1402 "name": "A String", # Name describing the field. 1403 }, 1404 ], 1405 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows 1406 # scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and 1407 # 100 means no limit. Defaults to 0. Only one of rows_limit and 1408 # rows_limit_percent can be specified. Cannot be used in conjunction with 1409 # TimespanConfig. 1410 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference. 1411 # identified by its project_id, dataset_id, and table_name. Within a query 1412 # a table is often referenced with a string in the format of: 1413 # `<project_id>:<dataset_id>.<table_id>` or 1414 # `<project_id>.<dataset_id>.<table_id>`. 1415 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 1416 # If omitted, project ID is inferred from the API call. 1417 "tableId": "A String", # Name of the table. 1418 "datasetId": "A String", # Dataset ID of the table. 1419 }, 1420 }, 1421 "timespanConfig": { # Configuration of the timespan of the items to include in scanning. 1422 # Currently only supported when inspecting Google Cloud Storage and BigQuery. 1423 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. 1424 # Used for data sources like Datastore or BigQuery. 1425 # If not specified for BigQuery, table last modification timestamp 1426 # is checked against given time span. 1427 # The valid data types of the timestamp field are: 1428 # for BigQuery - timestamp, date, datetime; 1429 # for Datastore - timestamp. 1430 # Datastore entity will be scanned if the timestamp property does not exist 1431 # or its value is empty or invalid. 1432 "name": "A String", # Name describing the field. 1433 }, 1434 "endTime": "A String", # Exclude files or rows newer than this value. 1435 # If set to zero, no upper time limit is applied. 1436 "startTime": "A String", # Exclude files or rows older than this value. 1437 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out 1438 # a valid start_time to avoid scanning files that have not been modified 1439 # since the last time the JobTrigger executed. This will be based on the 1440 # time of the execution of the last run of the JobTrigger. 1441 }, 1442 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification. 1443 # bucket. 1444 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger 1445 # than this value then the rest of the bytes are omitted. Only one 1446 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 1447 "sampleMethod": "A String", 1448 "fileSet": { # Set of files to scan. # The set of one or more files to scan. 1449 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format 1450 # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. 1451 # 1452 # If the url ends in a trailing slash, the bucket or directory represented 1453 # by the url will be scanned non-recursively (content in sub-directories 1454 # will not be scanned). This means that `gs://mybucket/` is equivalent to 1455 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to 1456 # `gs://mybucket/directory/*`. 1457 # 1458 # Exactly one of `url` or `regex_file_set` must be set. 1459 "regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or 1460 # `regex_file_set` must be set. 1461 # expressions are used to allow fine-grained control over which files in the 1462 # bucket to include. 1463 # 1464 # Included files are those that match at least one item in `include_regex` and 1465 # do not match any items in `exclude_regex`. Note that a file that matches 1466 # items from both lists will _not_ be included. For a match to occur, the 1467 # entire file path (i.e., everything in the url after the bucket name) must 1468 # match the regular expression. 1469 # 1470 # For example, given the input `{bucket_name: "mybucket", include_regex: 1471 # ["directory1/.*"], exclude_regex: 1472 # ["directory1/excluded.*"]}`: 1473 # 1474 # * `gs://mybucket/directory1/myfile` will be included 1475 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches 1476 # across `/`) 1477 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the 1478 # full path doesn't match any items in `include_regex`) 1479 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path 1480 # matches an item in `exclude_regex`) 1481 # 1482 # If `include_regex` is left empty, it will match all files by default 1483 # (this is equivalent to setting `include_regex: [".*"]`). 1484 # 1485 # Some other common use cases: 1486 # 1487 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all 1488 # files in `mybucket` except for .pdf files 1489 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will 1490 # include all files directly under `gs://mybucket/directory/`, without matching 1491 # across `/` 1492 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in 1493 # the bucket that match at least one of these regular expressions will be 1494 # excluded from the scan. 1495 # 1496 # Regular expressions use RE2 1497 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 1498 # under the google/re2 repository on GitHub. 1499 "A String", 1500 ], 1501 "bucketName": "A String", # The name of a Cloud Storage bucket. Required. 1502 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in 1503 # the bucket that match at least one of these regular expressions will be 1504 # included in the set of files, except for those that also match an item in 1505 # `exclude_regex`. Leaving this field empty will match all files by default 1506 # (this is equivalent to including `.*` in the list). 1507 # 1508 # Regular expressions use RE2 1509 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 1510 # under the google/re2 repository on GitHub. 1511 "A String", 1512 ], 1513 }, 1514 }, 1515 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The 1516 # number of bytes scanned is rounded down. Must be between 0 and 100, 1517 # inclusively. Both 0 and 100 means no limit. Defaults to 0. Only one 1518 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 1519 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. 1520 # Number of files scanned is rounded down. Must be between 0 and 100, 1521 # inclusively. Both 0 and 100 means no limit. Defaults to 0. 1522 "fileTypes": [ # List of file type groups to include in the scan. 1523 # If empty, all files are scanned and available data format processors 1524 # are applied. In addition, the binary content of the selected files 1525 # is always scanned as well. 1526 "A String", 1527 ], 1528 }, 1529 }, 1530 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. 1531 # When used with redactContent only info_types and min_likelihood are currently 1532 # used. 1533 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 1534 "limits": { 1535 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 1536 # When set within `InspectContentRequest`, the maximum returned is 2000 1537 # regardless if this is set higher. 1538 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 1539 { # Max findings configuration per infoType, per content item or long 1540 # running DlpJob. 1541 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 1542 # info_type should be provided. If InfoTypeLimit does not have an 1543 # info_type, the DLP API applies the limit against all info_types that 1544 # are found but not specified in another InfoTypeLimit. 1545 "name": "A String", # Name of the information type. Either a name of your choosing when 1546 # creating a CustomInfoType, or one of the names listed 1547 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1548 # a built-in type. InfoType names should conform to the pattern 1549 # [a-zA-Z0-9_]{1,64}. 1550 }, 1551 "maxFindings": 42, # Max findings limit for the given infoType. 1552 }, 1553 ], 1554 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 1555 # When set within `InspectDataSourceRequest`, 1556 # the maximum returned is 2000 regardless if this is set higher. 1557 # When set within `InspectContentRequest`, this field is ignored. 1558 }, 1559 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 1560 # POSSIBLE. 1561 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 1562 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 1563 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 1564 { # Custom information type provided by the user. Used to find domain-specific 1565 # sensitive information configurable to the data in question. 1566 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 1567 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1568 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1569 # google/re2 repository on GitHub. 1570 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1571 # specified, the entire match is returned. No more than 3 may be included. 1572 42, 1573 ], 1574 }, 1575 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 1576 # support reversing. 1577 # such as 1578 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 1579 # These types of transformations are 1580 # those that perform pseudonymization, thereby producing a "surrogate" as 1581 # output. This should be used in conjunction with a field on the 1582 # transformation such as `surrogate_info_type`. This CustomInfoType does 1583 # not support the use of `detection_rules`. 1584 }, 1585 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 1586 # infoType, when the name matches one of existing infoTypes and that infoType 1587 # is specified in `InspectContent.info_types` field. Specifying the latter 1588 # adds findings to the one detected by the system. If built-in info type is 1589 # not specified in `InspectContent.info_types` list then the name is treated 1590 # as a custom info type. 1591 "name": "A String", # Name of the information type. Either a name of your choosing when 1592 # creating a CustomInfoType, or one of the names listed 1593 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1594 # a built-in type. InfoType names should conform to the pattern 1595 # [a-zA-Z0-9_]{1,64}. 1596 }, 1597 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 1598 # be used to match sensitive information specific to the data, such as a list 1599 # of employee IDs or job titles. 1600 # 1601 # Dictionary words are case-insensitive and all characters other than letters 1602 # and digits in the unicode [Basic Multilingual 1603 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 1604 # will be replaced with whitespace when scanning for matches, so the 1605 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 1606 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 1607 # surrounding any match must be of a different type than the adjacent 1608 # characters within the word, so letters must be next to non-letters and 1609 # digits next to non-digits. For example, the dictionary word "jen" will 1610 # match the first three letters of the text "jen123" but will return no 1611 # matches for "jennifer". 1612 # 1613 # Dictionary words containing a large number of characters that are not 1614 # letters or digits may result in unexpected findings because such characters 1615 # are treated as whitespace. The 1616 # [limits](https://cloud.google.com/dlp/limits) page contains details about 1617 # the size limits of dictionaries. For dictionaries that do not fit within 1618 # these constraints, consider using `LargeCustomDictionaryConfig` in the 1619 # `StoredInfoType` API. 1620 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 1621 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 1622 # at least one phrase and every phrase must contain at least 2 characters 1623 # that are letters or digits. [required] 1624 "A String", 1625 ], 1626 }, 1627 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 1628 # is accepted. 1629 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 1630 # Example: gs://[BUCKET_NAME]/dictionary.txt 1631 }, 1632 }, 1633 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 1634 # `InspectDataSource`. Not currently supported in `InspectContent`. 1635 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 1636 # `organizations/433245324/storedInfoTypes/432452342` or 1637 # `projects/project-id/storedInfoTypes/432452342`. 1638 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 1639 # inspection was created. Output-only field, populated by the system. 1640 }, 1641 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 1642 # Rules are applied in order that they are specified. Not supported for the 1643 # `surrogate_type` CustomInfoType. 1644 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 1645 # `CustomInfoType` to alter behavior under certain circumstances, depending 1646 # on the specific details of the rule. Not supported for the `surrogate_type` 1647 # custom infoType. 1648 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 1649 # proximity of hotwords. 1650 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 1651 # The total length of the window cannot exceed 1000 characters. Note that 1652 # the finding itself will be included in the window, so that hotwords may 1653 # be used to match substrings of the finding itself. For example, the 1654 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 1655 # adjusted upwards if the area code is known to be the local area code of 1656 # a company office using the hotword regex "\(xxx\)", where "xxx" 1657 # is the area code in question. 1658 # rule. 1659 "windowAfter": 42, # Number of characters after the finding to consider. 1660 "windowBefore": 42, # Number of characters before the finding to consider. 1661 }, 1662 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 1663 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1664 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1665 # google/re2 repository on GitHub. 1666 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1667 # specified, the entire match is returned. No more than 3 may be included. 1668 42, 1669 ], 1670 }, 1671 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 1672 # part of a detection rule. 1673 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 1674 # levels. For example, if a finding would be `POSSIBLE` without the 1675 # detection rule and `relative_likelihood` is 1, then it is upgraded to 1676 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 1677 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 1678 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 1679 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 1680 # a final likelihood of `LIKELY`. 1681 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 1682 }, 1683 }, 1684 }, 1685 ], 1686 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 1687 # to be returned. It still can be used for rules matching. 1688 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 1689 # altered by a detection rule if the finding meets the criteria specified by 1690 # the rule. Defaults to `VERY_LIKELY` if not specified. 1691 }, 1692 ], 1693 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 1694 # included in the response; see Finding.quote. 1695 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 1696 # Exclusion rules, contained in the set are executed in the end, other 1697 # rules are executed in the order they are specified for each info type. 1698 { # Rule set for modifying a set of infoTypes to alter behavior under certain 1699 # circumstances, depending on the specific details of the rules within the set. 1700 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 1701 { # A single inspection rule to be applied to infoTypes, specified in 1702 # `InspectionRuleSet`. 1703 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 1704 # proximity of hotwords. 1705 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 1706 # The total length of the window cannot exceed 1000 characters. Note that 1707 # the finding itself will be included in the window, so that hotwords may 1708 # be used to match substrings of the finding itself. For example, the 1709 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 1710 # adjusted upwards if the area code is known to be the local area code of 1711 # a company office using the hotword regex "\(xxx\)", where "xxx" 1712 # is the area code in question. 1713 # rule. 1714 "windowAfter": 42, # Number of characters after the finding to consider. 1715 "windowBefore": 42, # Number of characters before the finding to consider. 1716 }, 1717 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 1718 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1719 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1720 # google/re2 repository on GitHub. 1721 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1722 # specified, the entire match is returned. No more than 3 may be included. 1723 42, 1724 ], 1725 }, 1726 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 1727 # part of a detection rule. 1728 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 1729 # levels. For example, if a finding would be `POSSIBLE` without the 1730 # detection rule and `relative_likelihood` is 1, then it is upgraded to 1731 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 1732 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 1733 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 1734 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 1735 # a final likelihood of `LIKELY`. 1736 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 1737 }, 1738 }, 1739 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 1740 # `InspectionRuleSet` are removed from results. 1741 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 1742 "pattern": "A String", # Pattern defining the regular expression. Its syntax 1743 # (https://github.com/google/re2/wiki/Syntax) can be found under the 1744 # google/re2 repository on GitHub. 1745 "groupIndexes": [ # The index of the submatch to extract as findings. When not 1746 # specified, the entire match is returned. No more than 3 may be included. 1747 42, 1748 ], 1749 }, 1750 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 1751 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 1752 # contained within with a finding of an infoType from this list. For 1753 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 1754 # `exclusion_rule` containing `exclude_info_types.info_types` with 1755 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 1756 # with EMAIL_ADDRESS finding. 1757 # That leads to "555-222-2222@example.org" to generate only a single 1758 # finding, namely email address. 1759 { # Type of information detected by the API. 1760 "name": "A String", # Name of the information type. Either a name of your choosing when 1761 # creating a CustomInfoType, or one of the names listed 1762 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1763 # a built-in type. InfoType names should conform to the pattern 1764 # [a-zA-Z0-9_]{1,64}. 1765 }, 1766 ], 1767 }, 1768 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 1769 # be used to match sensitive information specific to the data, such as a list 1770 # of employee IDs or job titles. 1771 # 1772 # Dictionary words are case-insensitive and all characters other than letters 1773 # and digits in the unicode [Basic Multilingual 1774 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 1775 # will be replaced with whitespace when scanning for matches, so the 1776 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 1777 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 1778 # surrounding any match must be of a different type than the adjacent 1779 # characters within the word, so letters must be next to non-letters and 1780 # digits next to non-digits. For example, the dictionary word "jen" will 1781 # match the first three letters of the text "jen123" but will return no 1782 # matches for "jennifer". 1783 # 1784 # Dictionary words containing a large number of characters that are not 1785 # letters or digits may result in unexpected findings because such characters 1786 # are treated as whitespace. The 1787 # [limits](https://cloud.google.com/dlp/limits) page contains details about 1788 # the size limits of dictionaries. For dictionaries that do not fit within 1789 # these constraints, consider using `LargeCustomDictionaryConfig` in the 1790 # `StoredInfoType` API. 1791 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 1792 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 1793 # at least one phrase and every phrase must contain at least 2 characters 1794 # that are letters or digits. [required] 1795 "A String", 1796 ], 1797 }, 1798 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 1799 # is accepted. 1800 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 1801 # Example: gs://[BUCKET_NAME]/dictionary.txt 1802 }, 1803 }, 1804 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 1805 }, 1806 }, 1807 ], 1808 "infoTypes": [ # List of infoTypes this rule set is applied to. 1809 { # Type of information detected by the API. 1810 "name": "A String", # Name of the information type. Either a name of your choosing when 1811 # creating a CustomInfoType, or one of the names listed 1812 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1813 # a built-in type. InfoType names should conform to the pattern 1814 # [a-zA-Z0-9_]{1,64}. 1815 }, 1816 ], 1817 }, 1818 ], 1819 "contentOptions": [ # List of options defining data content to scan. 1820 # If empty, text, images, and other content will be included. 1821 "A String", 1822 ], 1823 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 1824 # InfoType values returned by ListInfoTypes or listed at 1825 # https://cloud.google.com/dlp/docs/infotypes-reference. 1826 # 1827 # When no InfoTypes or CustomInfoTypes are specified in a request, the 1828 # system may automatically choose what detectors to run. By default this may 1829 # be all types, but may change over time as detectors are updated. 1830 # 1831 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 1832 # but may change over time as new InfoTypes are added. If you need precise 1833 # control and predictability as to what detectors are run you should specify 1834 # specific InfoTypes listed in the reference. 1835 { # Type of information detected by the API. 1836 "name": "A String", # Name of the information type. Either a name of your choosing when 1837 # creating a CustomInfoType, or one of the names listed 1838 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1839 # a built-in type. InfoType names should conform to the pattern 1840 # [a-zA-Z0-9_]{1,64}. 1841 }, 1842 ], 1843 }, 1844 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. 1845 # `inspect_config` will be merged into the values persisted as part of the 1846 # template. 1847 "actions": [ # Actions to execute at the completion of the job. 1848 { # A task to execute on the completion of a job. 1849 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. 1850 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. 1851 # OutputStorageConfig. Only a single instance of this action can be 1852 # specified. 1853 # Compatible with: Inspect, Risk 1854 "outputConfig": { # Cloud repository for storing output. 1855 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing 1856 # dataset. If table_id is not set a new one will be generated 1857 # for you with the following format: 1858 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for 1859 # generating the date details. 1860 # 1861 # For Inspect, each column in an existing output table must have the same 1862 # name, type, and mode of a field in the `Finding` object. 1863 # 1864 # For Risk, an existing output table should be the output of a previous 1865 # Risk analysis job run on the same source table, with the same privacy 1866 # metric and quasi-identifiers. Risk jobs that analyze the same table but 1867 # compute a different privacy metric, or use different sets of 1868 # quasi-identifiers, cannot store their results in the same table. 1869 # identified by its project_id, dataset_id, and table_name. Within a query 1870 # a table is often referenced with a string in the format of: 1871 # `<project_id>:<dataset_id>.<table_id>` or 1872 # `<project_id>.<dataset_id>.<table_id>`. 1873 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 1874 # If omitted, project ID is inferred from the API call. 1875 "tableId": "A String", # Name of the table. 1876 "datasetId": "A String", # Dataset ID of the table. 1877 }, 1878 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only 1879 # used for Inspect and must be unspecified for Risk jobs. Columns are derived 1880 # from the `Finding` object. If appending to an existing table, any columns 1881 # from the predefined schema that are missing will be added. No columns in 1882 # the existing table will be deleted. 1883 # 1884 # If unspecified, then all available columns will be used for a new table or 1885 # an (existing) table with no schema, and no changes will be made to an 1886 # existing table that has a schema. 1887 }, 1888 }, 1889 "jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's 1890 # completion/failure. 1891 # completion/failure. 1892 }, 1893 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). 1894 # Command Center (CSCC Alpha). 1895 # This action is only available for projects which are parts of 1896 # an organization and whitelisted for the alpha Cloud Security Command 1897 # Center. 1898 # The action will publish count of finding instances and their info types. 1899 # The summary of findings will be persisted in CSCC and are governed by CSCC 1900 # service-specific policy, see https://cloud.google.com/terms/service-terms 1901 # Only a single instance of this action can be specified. 1902 # Compatible with: Inspect 1903 }, 1904 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic. 1905 # message contains a single field, `DlpJobName`, which is equal to the 1906 # finished job's 1907 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). 1908 # Compatible with: Inspect, Risk 1909 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given 1910 # publishing access rights to the DLP API service account executing 1911 # the long running DlpJob sending the notifications. 1912 # Format is projects/{project}/topics/{topic}. 1913 }, 1914 }, 1915 ], 1916 }, 1917 }, 1918 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. 1919 "infoTypeStats": [ # Statistics of how many instances of each info type were found during 1920 # inspect job. 1921 { # Statistics regarding a specific InfoType. 1922 "count": "A String", # Number of findings for this infoType. 1923 "infoType": { # Type of information detected by the API. # The type of finding this stat is for. 1924 "name": "A String", # Name of the information type. Either a name of your choosing when 1925 # creating a CustomInfoType, or one of the names listed 1926 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 1927 # a built-in type. InfoType names should conform to the pattern 1928 # [a-zA-Z0-9_]{1,64}. 1929 }, 1930 }, 1931 ], 1932 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. 1933 "processedBytes": "A String", # Total size in bytes that were processed. 1934 }, 1935 }, 1936 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. 1937 "numericalStatsResult": { # Result of the numerical stats computation. 1938 "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal 1939 # sized buckets. 1940 { # Set of primitive values supported by the system. 1941 # Note that for the purposes of inspection or transformation, the number 1942 # of bytes considered to comprise a 'Value' is based on its representation 1943 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 1944 # 123456789, the number of bytes would be counted as 9, even though an 1945 # int64 only holds up to 8 bytes of data. 1946 "floatValue": 3.14, 1947 "timestampValue": "A String", 1948 "dayOfWeekValue": "A String", 1949 "timeValue": { # Represents a time of day. The date and time zone are either not significant 1950 # or are specified elsewhere. An API may choose to allow leap seconds. Related 1951 # types are google.type.Date and `google.protobuf.Timestamp`. 1952 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 1953 # to allow the value "24:00:00" for scenarios like business closing time. 1954 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 1955 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 1956 # allow the value 60 if it allows leap-seconds. 1957 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 1958 }, 1959 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 1960 # and time zone are either specified elsewhere or are not significant. The date 1961 # is relative to the Proleptic Gregorian Calendar. This can represent: 1962 # 1963 # * A full date, with non-zero year, month and day values 1964 # * A month and day value, with a zero year, e.g. an anniversary 1965 # * A year on its own, with zero month and day values 1966 # * A year and month value, with a zero day, e.g. a credit card expiration date 1967 # 1968 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 1969 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 1970 # a year. 1971 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 1972 # if specifying a year by itself or a year and month where the day is not 1973 # significant. 1974 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 1975 # month and day. 1976 }, 1977 "stringValue": "A String", 1978 "booleanValue": True or False, 1979 "integerValue": "A String", 1980 }, 1981 ], 1982 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. 1983 # Note that for the purposes of inspection or transformation, the number 1984 # of bytes considered to comprise a 'Value' is based on its representation 1985 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 1986 # 123456789, the number of bytes would be counted as 9, even though an 1987 # int64 only holds up to 8 bytes of data. 1988 "floatValue": 3.14, 1989 "timestampValue": "A String", 1990 "dayOfWeekValue": "A String", 1991 "timeValue": { # Represents a time of day. The date and time zone are either not significant 1992 # or are specified elsewhere. An API may choose to allow leap seconds. Related 1993 # types are google.type.Date and `google.protobuf.Timestamp`. 1994 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 1995 # to allow the value "24:00:00" for scenarios like business closing time. 1996 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 1997 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 1998 # allow the value 60 if it allows leap-seconds. 1999 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2000 }, 2001 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2002 # and time zone are either specified elsewhere or are not significant. The date 2003 # is relative to the Proleptic Gregorian Calendar. This can represent: 2004 # 2005 # * A full date, with non-zero year, month and day values 2006 # * A month and day value, with a zero year, e.g. an anniversary 2007 # * A year on its own, with zero month and day values 2008 # * A year and month value, with a zero day, e.g. a credit card expiration date 2009 # 2010 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2011 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2012 # a year. 2013 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2014 # if specifying a year by itself or a year and month where the day is not 2015 # significant. 2016 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2017 # month and day. 2018 }, 2019 "stringValue": "A String", 2020 "booleanValue": True or False, 2021 "integerValue": "A String", 2022 }, 2023 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. 2024 # Note that for the purposes of inspection or transformation, the number 2025 # of bytes considered to comprise a 'Value' is based on its representation 2026 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2027 # 123456789, the number of bytes would be counted as 9, even though an 2028 # int64 only holds up to 8 bytes of data. 2029 "floatValue": 3.14, 2030 "timestampValue": "A String", 2031 "dayOfWeekValue": "A String", 2032 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2033 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2034 # types are google.type.Date and `google.protobuf.Timestamp`. 2035 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2036 # to allow the value "24:00:00" for scenarios like business closing time. 2037 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2038 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2039 # allow the value 60 if it allows leap-seconds. 2040 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2041 }, 2042 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2043 # and time zone are either specified elsewhere or are not significant. The date 2044 # is relative to the Proleptic Gregorian Calendar. This can represent: 2045 # 2046 # * A full date, with non-zero year, month and day values 2047 # * A month and day value, with a zero year, e.g. an anniversary 2048 # * A year on its own, with zero month and day values 2049 # * A year and month value, with a zero day, e.g. a credit card expiration date 2050 # 2051 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2052 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2053 # a year. 2054 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2055 # if specifying a year by itself or a year and month where the day is not 2056 # significant. 2057 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2058 # month and day. 2059 }, 2060 "stringValue": "A String", 2061 "booleanValue": True or False, 2062 "integerValue": "A String", 2063 }, 2064 }, 2065 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an 2066 # estimation, not exact values. 2067 "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value 2068 # doesn't correspond to any such interval, the associated frequency is 2069 # zero. For example, the following records: 2070 # {min_anonymity: 1, max_anonymity: 1, frequency: 17} 2071 # {min_anonymity: 2, max_anonymity: 3, frequency: 42} 2072 # {min_anonymity: 5, max_anonymity: 10, frequency: 99} 2073 # mean that there are no record with an estimated anonymity of 4, 5, or 2074 # larger than 10. 2075 { # A KMapEstimationHistogramBucket message with the following values: 2076 # min_anonymity: 3 2077 # max_anonymity: 5 2078 # frequency: 42 2079 # means that there are 42 records whose quasi-identifier values correspond 2080 # to 3, 4 or 5 people in the overlying population. An important particular 2081 # case is when min_anonymity = max_anonymity = 1: the frequency field then 2082 # corresponds to the number of uniquely identifiable records. 2083 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 2084 # number of classes returned per bucket is capped at 20. 2085 { # A tuple of values for the quasi-identifier columns. 2086 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. 2087 "quasiIdsValues": [ # The quasi-identifier values. 2088 { # Set of primitive values supported by the system. 2089 # Note that for the purposes of inspection or transformation, the number 2090 # of bytes considered to comprise a 'Value' is based on its representation 2091 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2092 # 123456789, the number of bytes would be counted as 9, even though an 2093 # int64 only holds up to 8 bytes of data. 2094 "floatValue": 3.14, 2095 "timestampValue": "A String", 2096 "dayOfWeekValue": "A String", 2097 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2098 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2099 # types are google.type.Date and `google.protobuf.Timestamp`. 2100 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2101 # to allow the value "24:00:00" for scenarios like business closing time. 2102 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2103 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2104 # allow the value 60 if it allows leap-seconds. 2105 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2106 }, 2107 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2108 # and time zone are either specified elsewhere or are not significant. The date 2109 # is relative to the Proleptic Gregorian Calendar. This can represent: 2110 # 2111 # * A full date, with non-zero year, month and day values 2112 # * A month and day value, with a zero year, e.g. an anniversary 2113 # * A year on its own, with zero month and day values 2114 # * A year and month value, with a zero day, e.g. a credit card expiration date 2115 # 2116 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2117 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2118 # a year. 2119 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2120 # if specifying a year by itself or a year and month where the day is not 2121 # significant. 2122 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2123 # month and day. 2124 }, 2125 "stringValue": "A String", 2126 "booleanValue": True or False, 2127 "integerValue": "A String", 2128 }, 2129 ], 2130 }, 2131 ], 2132 "minAnonymity": "A String", # Always positive. 2133 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 2134 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. 2135 "bucketSize": "A String", # Number of records within these anonymity bounds. 2136 }, 2137 ], 2138 }, 2139 "kAnonymityResult": { # Result of the k-anonymity computation. 2140 "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. 2141 { 2142 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 2143 # classes returned per bucket is capped at 20. 2144 { # The set of columns' values that share the same ldiversity value 2145 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per 2146 # quasi-identifier column in the original KAnonymity metric message. 2147 # The order is always the same as the original request. 2148 { # Set of primitive values supported by the system. 2149 # Note that for the purposes of inspection or transformation, the number 2150 # of bytes considered to comprise a 'Value' is based on its representation 2151 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2152 # 123456789, the number of bytes would be counted as 9, even though an 2153 # int64 only holds up to 8 bytes of data. 2154 "floatValue": 3.14, 2155 "timestampValue": "A String", 2156 "dayOfWeekValue": "A String", 2157 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2158 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2159 # types are google.type.Date and `google.protobuf.Timestamp`. 2160 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2161 # to allow the value "24:00:00" for scenarios like business closing time. 2162 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2163 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2164 # allow the value 60 if it allows leap-seconds. 2165 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2166 }, 2167 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2168 # and time zone are either specified elsewhere or are not significant. The date 2169 # is relative to the Proleptic Gregorian Calendar. This can represent: 2170 # 2171 # * A full date, with non-zero year, month and day values 2172 # * A month and day value, with a zero year, e.g. an anniversary 2173 # * A year on its own, with zero month and day values 2174 # * A year and month value, with a zero day, e.g. a credit card expiration date 2175 # 2176 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2177 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2178 # a year. 2179 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2180 # if specifying a year by itself or a year and month where the day is not 2181 # significant. 2182 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2183 # month and day. 2184 }, 2185 "stringValue": "A String", 2186 "booleanValue": True or False, 2187 "integerValue": "A String", 2188 }, 2189 ], 2190 "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the 2191 # above set of values. 2192 }, 2193 ], 2194 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 2195 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. 2196 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. 2197 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 2198 }, 2199 ], 2200 }, 2201 "lDiversityResult": { # Result of the l-diversity computation. 2202 "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. 2203 { 2204 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 2205 # classes returned per bucket is capped at 20. 2206 { # The set of columns' values that share the same ldiversity value. 2207 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. 2208 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence 2209 # class. The order is always the same as the original request. 2210 { # Set of primitive values supported by the system. 2211 # Note that for the purposes of inspection or transformation, the number 2212 # of bytes considered to comprise a 'Value' is based on its representation 2213 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2214 # 123456789, the number of bytes would be counted as 9, even though an 2215 # int64 only holds up to 8 bytes of data. 2216 "floatValue": 3.14, 2217 "timestampValue": "A String", 2218 "dayOfWeekValue": "A String", 2219 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2220 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2221 # types are google.type.Date and `google.protobuf.Timestamp`. 2222 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2223 # to allow the value "24:00:00" for scenarios like business closing time. 2224 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2225 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2226 # allow the value 60 if it allows leap-seconds. 2227 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2228 }, 2229 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2230 # and time zone are either specified elsewhere or are not significant. The date 2231 # is relative to the Proleptic Gregorian Calendar. This can represent: 2232 # 2233 # * A full date, with non-zero year, month and day values 2234 # * A month and day value, with a zero year, e.g. an anniversary 2235 # * A year on its own, with zero month and day values 2236 # * A year and month value, with a zero day, e.g. a credit card expiration date 2237 # 2238 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2239 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2240 # a year. 2241 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2242 # if specifying a year by itself or a year and month where the day is not 2243 # significant. 2244 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2245 # month and day. 2246 }, 2247 "stringValue": "A String", 2248 "booleanValue": True or False, 2249 "integerValue": "A String", 2250 }, 2251 ], 2252 "topSensitiveValues": [ # Estimated frequencies of top sensitive values. 2253 { # A value of a field, including its frequency. 2254 "count": "A String", # How many times the value is contained in the field. 2255 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 2256 # Note that for the purposes of inspection or transformation, the number 2257 # of bytes considered to comprise a 'Value' is based on its representation 2258 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2259 # 123456789, the number of bytes would be counted as 9, even though an 2260 # int64 only holds up to 8 bytes of data. 2261 "floatValue": 3.14, 2262 "timestampValue": "A String", 2263 "dayOfWeekValue": "A String", 2264 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2265 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2266 # types are google.type.Date and `google.protobuf.Timestamp`. 2267 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2268 # to allow the value "24:00:00" for scenarios like business closing time. 2269 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2270 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2271 # allow the value 60 if it allows leap-seconds. 2272 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2273 }, 2274 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2275 # and time zone are either specified elsewhere or are not significant. The date 2276 # is relative to the Proleptic Gregorian Calendar. This can represent: 2277 # 2278 # * A full date, with non-zero year, month and day values 2279 # * A month and day value, with a zero year, e.g. an anniversary 2280 # * A year on its own, with zero month and day values 2281 # * A year and month value, with a zero day, e.g. a credit card expiration date 2282 # 2283 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2284 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2285 # a year. 2286 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2287 # if specifying a year by itself or a year and month where the day is not 2288 # significant. 2289 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2290 # month and day. 2291 }, 2292 "stringValue": "A String", 2293 "booleanValue": True or False, 2294 "integerValue": "A String", 2295 }, 2296 }, 2297 ], 2298 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. 2299 }, 2300 ], 2301 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 2302 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 2303 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence 2304 # classes in this bucket. 2305 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence 2306 # classes in this bucket. 2307 }, 2308 ], 2309 }, 2310 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. 2311 "numericalStatsConfig": { # Compute numerical stats over an individual column, including 2312 # min, max, and quantiles. 2313 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are 2314 # integer, float, date, datetime, timestamp, time. 2315 "name": "A String", # Name describing the field. 2316 }, 2317 }, 2318 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what 2319 # is called "journalist risk" in the literature, except the attack dataset is 2320 # statistically modeled instead of being perfectly known. This can be done 2321 # using publicly available data (like the US Census), or using a custom 2322 # statistical model (indicated as one or several BigQuery tables), or by 2323 # extrapolating from the distribution of values in the input dataset. 2324 # A column with a semantic tag attached. 2325 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 2326 # Required if no column is tagged with a region-specific InfoType (like 2327 # US_ZIP_5) or a region code. 2328 "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the 2329 # same tag. [required] 2330 { 2331 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 2332 "name": "A String", # Name describing the field. 2333 }, 2334 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 2335 # indicate an auxiliary table that contains statistical information on 2336 # the possible values of this column (below). 2337 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 2338 # dataset as a statistical model of population, if available. We 2339 # currently support US ZIP codes, region codes, ages and genders. 2340 # To programmatically obtain the list of supported InfoTypes, use 2341 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 2342 "name": "A String", # Name of the information type. Either a name of your choosing when 2343 # creating a CustomInfoType, or one of the names listed 2344 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 2345 # a built-in type. InfoType names should conform to the pattern 2346 # [a-zA-Z0-9_]{1,64}. 2347 }, 2348 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 2349 # the distribution of values in the input data 2350 # empty messages in your APIs. A typical example is to use it as the request 2351 # or the response type of an API method. For instance: 2352 # 2353 # service Foo { 2354 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 2355 # } 2356 # 2357 # The JSON representation for `Empty` is empty JSON object `{}`. 2358 }, 2359 }, 2360 ], 2361 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 2362 # used to tag a quasi-identifiers column must appear in exactly one column 2363 # of one auxiliary table. 2364 { # An auxiliary table contains statistical information on the relative 2365 # frequency of different quasi-identifiers values. It has one or several 2366 # quasi-identifiers columns, and one column that indicates the relative 2367 # frequency of each quasi-identifier tuple. 2368 # If a tuple is present in the data but not in the auxiliary table, the 2369 # corresponding relative frequency is assumed to be zero (and thus, the 2370 # tuple is highly reidentifiable). 2371 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 2372 # between 0 and 1 (inclusive). Null values are assumed to be zero. 2373 # [required] 2374 "name": "A String", # Name describing the field. 2375 }, 2376 "quasiIds": [ # Quasi-identifier columns. [required] 2377 { # A quasi-identifier column has a custom_tag, used to know which column 2378 # in the data corresponds to which column in the statistical model. 2379 "field": { # General identifier of a data field in a storage service. 2380 "name": "A String", # Name describing the field. 2381 }, 2382 "customTag": "A String", 2383 }, 2384 ], 2385 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 2386 # identified by its project_id, dataset_id, and table_name. Within a query 2387 # a table is often referenced with a string in the format of: 2388 # `<project_id>:<dataset_id>.<table_id>` or 2389 # `<project_id>.<dataset_id>.<table_id>`. 2390 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 2391 # If omitted, project ID is inferred from the API call. 2392 "tableId": "A String", # Name of the table. 2393 "datasetId": "A String", # Dataset ID of the table. 2394 }, 2395 }, 2396 ], 2397 }, 2398 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. 2399 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. 2400 "name": "A String", # Name describing the field. 2401 }, 2402 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are 2403 # defined for the l-diversity computation. When multiple fields are 2404 # specified, they are considered a single composite key. 2405 { # General identifier of a data field in a storage service. 2406 "name": "A String", # Name describing the field. 2407 }, 2408 ], 2409 }, 2410 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to 2411 # figure out that one given individual appears in a de-identified dataset. 2412 # Similarly to the k-map metric, we cannot compute δ-presence exactly without 2413 # knowing the attack dataset, so we use a statistical model instead. 2414 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 2415 # Required if no column is tagged with a region-specific InfoType (like 2416 # US_ZIP_5) or a region code. 2417 "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the 2418 # same tag. [required] 2419 { # A column with a semantic tag attached. 2420 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 2421 "name": "A String", # Name describing the field. 2422 }, 2423 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 2424 # indicate an auxiliary table that contains statistical information on 2425 # the possible values of this column (below). 2426 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 2427 # dataset as a statistical model of population, if available. We 2428 # currently support US ZIP codes, region codes, ages and genders. 2429 # To programmatically obtain the list of supported InfoTypes, use 2430 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 2431 "name": "A String", # Name of the information type. Either a name of your choosing when 2432 # creating a CustomInfoType, or one of the names listed 2433 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 2434 # a built-in type. InfoType names should conform to the pattern 2435 # [a-zA-Z0-9_]{1,64}. 2436 }, 2437 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 2438 # the distribution of values in the input data 2439 # empty messages in your APIs. A typical example is to use it as the request 2440 # or the response type of an API method. For instance: 2441 # 2442 # service Foo { 2443 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 2444 # } 2445 # 2446 # The JSON representation for `Empty` is empty JSON object `{}`. 2447 }, 2448 }, 2449 ], 2450 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 2451 # used to tag a quasi-identifiers field must appear in exactly one 2452 # field of one auxiliary table. 2453 { # An auxiliary table containing statistical information on the relative 2454 # frequency of different quasi-identifiers values. It has one or several 2455 # quasi-identifiers columns, and one column that indicates the relative 2456 # frequency of each quasi-identifier tuple. 2457 # If a tuple is present in the data but not in the auxiliary table, the 2458 # corresponding relative frequency is assumed to be zero (and thus, the 2459 # tuple is highly reidentifiable). 2460 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 2461 # between 0 and 1 (inclusive). Null values are assumed to be zero. 2462 # [required] 2463 "name": "A String", # Name describing the field. 2464 }, 2465 "quasiIds": [ # Quasi-identifier columns. [required] 2466 { # A quasi-identifier column has a custom_tag, used to know which column 2467 # in the data corresponds to which column in the statistical model. 2468 "field": { # General identifier of a data field in a storage service. 2469 "name": "A String", # Name describing the field. 2470 }, 2471 "customTag": "A String", 2472 }, 2473 ], 2474 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 2475 # identified by its project_id, dataset_id, and table_name. Within a query 2476 # a table is often referenced with a string in the format of: 2477 # `<project_id>:<dataset_id>.<table_id>` or 2478 # `<project_id>.<dataset_id>.<table_id>`. 2479 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 2480 # If omitted, project ID is inferred from the API call. 2481 "tableId": "A String", # Name of the table. 2482 "datasetId": "A String", # Dataset ID of the table. 2483 }, 2484 }, 2485 ], 2486 }, 2487 "categoricalStatsConfig": { # Compute numerical stats over an individual column, including 2488 # number of distinct values and value count distribution. 2489 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are 2490 # supported except for arrays and structs. However, it may be more 2491 # informative to use NumericalStats when the field type is supported, 2492 # depending on the data. 2493 "name": "A String", # Name describing the field. 2494 }, 2495 }, 2496 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. 2497 "entityId": { # An entity in a dataset is a field or set of fields that correspond to a # Optional message indicating that multiple rows might be associated to a 2498 # single individual. If the same entity_id is associated to multiple 2499 # quasi-identifier tuples over distinct rows, we consider the entire 2500 # collection of tuples as the composite quasi-identifier. This collection 2501 # is a multiset: the order in which the different tuples appear in the 2502 # dataset is ignored, but their frequency is taken into account. 2503 # 2504 # Important note: a maximum of 1000 rows can be associated to a single 2505 # entity ID. If more rows are associated with the same entity ID, some 2506 # might be ignored. 2507 # single person. For example, in medical records the `EntityId` might be a 2508 # patient identifier, or for financial records it might be an account 2509 # identifier. This message is used when generalizations or analysis must take 2510 # into account that multiple rows correspond to the same entity. 2511 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. 2512 "name": "A String", # Name describing the field. 2513 }, 2514 }, 2515 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are 2516 # specified, they are considered a single composite key. Structs and 2517 # repeated data types are not supported; however, nested fields are 2518 # supported so long as they are not structs themselves or nested within 2519 # a repeated field. 2520 { # General identifier of a data field in a storage service. 2521 "name": "A String", # Name describing the field. 2522 }, 2523 ], 2524 }, 2525 }, 2526 "categoricalStatsResult": { # Result of the categorical stats computation. 2527 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. 2528 { 2529 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of 2530 # values returned per bucket is capped at 20. 2531 { # A value of a field, including its frequency. 2532 "count": "A String", # How many times the value is contained in the field. 2533 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 2534 # Note that for the purposes of inspection or transformation, the number 2535 # of bytes considered to comprise a 'Value' is based on its representation 2536 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2537 # 123456789, the number of bytes would be counted as 9, even though an 2538 # int64 only holds up to 8 bytes of data. 2539 "floatValue": 3.14, 2540 "timestampValue": "A String", 2541 "dayOfWeekValue": "A String", 2542 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2543 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2544 # types are google.type.Date and `google.protobuf.Timestamp`. 2545 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2546 # to allow the value "24:00:00" for scenarios like business closing time. 2547 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2548 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2549 # allow the value 60 if it allows leap-seconds. 2550 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2551 }, 2552 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2553 # and time zone are either specified elsewhere or are not significant. The date 2554 # is relative to the Proleptic Gregorian Calendar. This can represent: 2555 # 2556 # * A full date, with non-zero year, month and day values 2557 # * A month and day value, with a zero year, e.g. an anniversary 2558 # * A year on its own, with zero month and day values 2559 # * A year and month value, with a zero day, e.g. a credit card expiration date 2560 # 2561 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2562 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2563 # a year. 2564 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2565 # if specifying a year by itself or a year and month where the day is not 2566 # significant. 2567 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2568 # month and day. 2569 }, 2570 "stringValue": "A String", 2571 "booleanValue": True or False, 2572 "integerValue": "A String", 2573 }, 2574 }, 2575 ], 2576 "bucketValueCount": "A String", # Total number of distinct values in this bucket. 2577 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. 2578 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. 2579 "bucketSize": "A String", # Total number of values in this bucket. 2580 }, 2581 ], 2582 }, 2583 "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an 2584 # estimation, not exact values. 2585 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a 2586 # value doesn't correspond to any such interval, the associated frequency 2587 # is zero. For example, the following records: 2588 # {min_probability: 0, max_probability: 0.1, frequency: 17} 2589 # {min_probability: 0.2, max_probability: 0.3, frequency: 42} 2590 # {min_probability: 0.3, max_probability: 0.4, frequency: 99} 2591 # mean that there are no record with an estimated probability in [0.1, 0.2) 2592 # nor larger or equal to 0.4. 2593 { # A DeltaPresenceEstimationHistogramBucket message with the following 2594 # values: 2595 # min_probability: 0.1 2596 # max_probability: 0.2 2597 # frequency: 42 2598 # means that there are 42 records for which δ is in [0.1, 0.2). An 2599 # important particular case is when min_probability = max_probability = 1: 2600 # then, every individual who shares this quasi-identifier combination is in 2601 # the dataset. 2602 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 2603 # number of classes returned per bucket is capped at 20. 2604 { # A tuple of values for the quasi-identifier columns. 2605 "quasiIdsValues": [ # The quasi-identifier values. 2606 { # Set of primitive values supported by the system. 2607 # Note that for the purposes of inspection or transformation, the number 2608 # of bytes considered to comprise a 'Value' is based on its representation 2609 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 2610 # 123456789, the number of bytes would be counted as 9, even though an 2611 # int64 only holds up to 8 bytes of data. 2612 "floatValue": 3.14, 2613 "timestampValue": "A String", 2614 "dayOfWeekValue": "A String", 2615 "timeValue": { # Represents a time of day. The date and time zone are either not significant 2616 # or are specified elsewhere. An API may choose to allow leap seconds. Related 2617 # types are google.type.Date and `google.protobuf.Timestamp`. 2618 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 2619 # to allow the value "24:00:00" for scenarios like business closing time. 2620 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 2621 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 2622 # allow the value 60 if it allows leap-seconds. 2623 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 2624 }, 2625 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 2626 # and time zone are either specified elsewhere or are not significant. The date 2627 # is relative to the Proleptic Gregorian Calendar. This can represent: 2628 # 2629 # * A full date, with non-zero year, month and day values 2630 # * A month and day value, with a zero year, e.g. an anniversary 2631 # * A year on its own, with zero month and day values 2632 # * A year and month value, with a zero day, e.g. a credit card expiration date 2633 # 2634 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 2635 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 2636 # a year. 2637 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 2638 # if specifying a year by itself or a year and month where the day is not 2639 # significant. 2640 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 2641 # month and day. 2642 }, 2643 "stringValue": "A String", 2644 "booleanValue": True or False, 2645 "integerValue": "A String", 2646 }, 2647 ], 2648 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these 2649 # quasi-identifier values is in the dataset. This value, typically called 2650 # δ, is the ratio between the number of records in the dataset with these 2651 # quasi-identifier values, and the total number of individuals (inside 2652 # *and* outside the dataset) with these quasi-identifier values. 2653 # For example, if there are 15 individuals in the dataset who share the 2654 # same quasi-identifier values, and an estimated 100 people in the entire 2655 # population with these values, then δ is 0.15. 2656 }, 2657 ], 2658 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 2659 "bucketSize": "A String", # Number of records within these probability bounds. 2660 "maxProbability": 3.14, # Always greater than or equal to min_probability. 2661 "minProbability": 3.14, # Between 0 and 1. 2662 }, 2663 ], 2664 }, 2665 "requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over. 2666 # identified by its project_id, dataset_id, and table_name. Within a query 2667 # a table is often referenced with a string in the format of: 2668 # `<project_id>:<dataset_id>.<table_id>` or 2669 # `<project_id>.<dataset_id>.<table_id>`. 2670 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 2671 # If omitted, project ID is inferred from the API call. 2672 "tableId": "A String", # Name of the table. 2673 "datasetId": "A String", # Dataset ID of the table. 2674 }, 2675 }, 2676 "state": "A String", # State of a job. 2677 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that 2678 # instantiated the job. 2679 "startTime": "A String", # Time when the job started. 2680 "endTime": "A String", # Time when the job finished. 2681 "type": "A String", # The type of job. 2682 "createTime": "A String", # Time when the job was created. 2683 }</pre> 2684</div> 2685 2686<div class="method"> 2687 <code class="details" id="delete">delete(name, x__xgafv=None)</code> 2688 <pre>Deletes a long-running DlpJob. This method indicates that the client is 2689no longer interested in the DlpJob result. The job will be cancelled if 2690possible. 2691See https://cloud.google.com/dlp/docs/inspecting-storage and 2692https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. 2693 2694Args: 2695 name: string, The name of the DlpJob resource to be deleted. (required) 2696 x__xgafv: string, V1 error format. 2697 Allowed values 2698 1 - v1 error format 2699 2 - v2 error format 2700 2701Returns: 2702 An object of the form: 2703 2704 { # A generic empty message that you can re-use to avoid defining duplicated 2705 # empty messages in your APIs. A typical example is to use it as the request 2706 # or the response type of an API method. For instance: 2707 # 2708 # service Foo { 2709 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 2710 # } 2711 # 2712 # The JSON representation for `Empty` is empty JSON object `{}`. 2713 }</pre> 2714</div> 2715 2716<div class="method"> 2717 <code class="details" id="get">get(name, x__xgafv=None)</code> 2718 <pre>Gets the latest state of a long-running DlpJob. 2719See https://cloud.google.com/dlp/docs/inspecting-storage and 2720https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. 2721 2722Args: 2723 name: string, The name of the DlpJob resource. (required) 2724 x__xgafv: string, V1 error format. 2725 Allowed values 2726 1 - v1 error format 2727 2 - v2 error format 2728 2729Returns: 2730 An object of the form: 2731 2732 { # Combines all of the information about a DLP job. 2733 "errors": [ # A stream of errors encountered running the job. 2734 { # Details information about an error encountered during job execution or 2735 # the results of an unsuccessful activation of the JobTrigger. 2736 # Output only field. 2737 "timestamps": [ # The times the error occurred. 2738 "A String", 2739 ], 2740 "details": { # The `Status` type defines a logical error model that is suitable for 2741 # different programming environments, including REST APIs and RPC APIs. It is 2742 # used by [gRPC](https://github.com/grpc). Each `Status` message contains 2743 # three pieces of data: error code, error message, and error details. 2744 # 2745 # You can find out more about this error model and how to work with it in the 2746 # [API Design Guide](https://cloud.google.com/apis/design/errors). 2747 "message": "A String", # A developer-facing error message, which should be in English. Any 2748 # user-facing error message should be localized and sent in the 2749 # google.rpc.Status.details field, or localized by the client. 2750 "code": 42, # The status code, which should be an enum value of google.rpc.Code. 2751 "details": [ # A list of messages that carry the error details. There is a common set of 2752 # message types for APIs to use. 2753 { 2754 "a_key": "", # Properties of the object. Contains field @type with type URL. 2755 }, 2756 ], 2757 }, 2758 }, 2759 ], 2760 "name": "A String", # The server-assigned name. 2761 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. 2762 "requestedOptions": { # The configuration used for this job. 2763 "snapshotInspectTemplate": { # The inspectTemplate contains a configuration (set of types of sensitive data # If run with an InspectTemplate, a snapshot of its state at the time of 2764 # this run. 2765 # to be detected) to be used anywhere you otherwise would normally specify 2766 # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates 2767 # to learn more. 2768 "updateTime": "A String", # The last update timestamp of a inspectTemplate, output only field. 2769 "displayName": "A String", # Display name (max 256 chars). 2770 "description": "A String", # Short description (max 256 chars). 2771 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. 2772 # When used with redactContent only info_types and min_likelihood are currently 2773 # used. 2774 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 2775 "limits": { 2776 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 2777 # When set within `InspectContentRequest`, the maximum returned is 2000 2778 # regardless if this is set higher. 2779 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 2780 { # Max findings configuration per infoType, per content item or long 2781 # running DlpJob. 2782 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 2783 # info_type should be provided. If InfoTypeLimit does not have an 2784 # info_type, the DLP API applies the limit against all info_types that 2785 # are found but not specified in another InfoTypeLimit. 2786 "name": "A String", # Name of the information type. Either a name of your choosing when 2787 # creating a CustomInfoType, or one of the names listed 2788 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 2789 # a built-in type. InfoType names should conform to the pattern 2790 # [a-zA-Z0-9_]{1,64}. 2791 }, 2792 "maxFindings": 42, # Max findings limit for the given infoType. 2793 }, 2794 ], 2795 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 2796 # When set within `InspectDataSourceRequest`, 2797 # the maximum returned is 2000 regardless if this is set higher. 2798 # When set within `InspectContentRequest`, this field is ignored. 2799 }, 2800 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 2801 # POSSIBLE. 2802 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 2803 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 2804 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 2805 { # Custom information type provided by the user. Used to find domain-specific 2806 # sensitive information configurable to the data in question. 2807 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 2808 "pattern": "A String", # Pattern defining the regular expression. Its syntax 2809 # (https://github.com/google/re2/wiki/Syntax) can be found under the 2810 # google/re2 repository on GitHub. 2811 "groupIndexes": [ # The index of the submatch to extract as findings. When not 2812 # specified, the entire match is returned. No more than 3 may be included. 2813 42, 2814 ], 2815 }, 2816 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 2817 # support reversing. 2818 # such as 2819 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 2820 # These types of transformations are 2821 # those that perform pseudonymization, thereby producing a "surrogate" as 2822 # output. This should be used in conjunction with a field on the 2823 # transformation such as `surrogate_info_type`. This CustomInfoType does 2824 # not support the use of `detection_rules`. 2825 }, 2826 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 2827 # infoType, when the name matches one of existing infoTypes and that infoType 2828 # is specified in `InspectContent.info_types` field. Specifying the latter 2829 # adds findings to the one detected by the system. If built-in info type is 2830 # not specified in `InspectContent.info_types` list then the name is treated 2831 # as a custom info type. 2832 "name": "A String", # Name of the information type. Either a name of your choosing when 2833 # creating a CustomInfoType, or one of the names listed 2834 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 2835 # a built-in type. InfoType names should conform to the pattern 2836 # [a-zA-Z0-9_]{1,64}. 2837 }, 2838 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 2839 # be used to match sensitive information specific to the data, such as a list 2840 # of employee IDs or job titles. 2841 # 2842 # Dictionary words are case-insensitive and all characters other than letters 2843 # and digits in the unicode [Basic Multilingual 2844 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 2845 # will be replaced with whitespace when scanning for matches, so the 2846 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 2847 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 2848 # surrounding any match must be of a different type than the adjacent 2849 # characters within the word, so letters must be next to non-letters and 2850 # digits next to non-digits. For example, the dictionary word "jen" will 2851 # match the first three letters of the text "jen123" but will return no 2852 # matches for "jennifer". 2853 # 2854 # Dictionary words containing a large number of characters that are not 2855 # letters or digits may result in unexpected findings because such characters 2856 # are treated as whitespace. The 2857 # [limits](https://cloud.google.com/dlp/limits) page contains details about 2858 # the size limits of dictionaries. For dictionaries that do not fit within 2859 # these constraints, consider using `LargeCustomDictionaryConfig` in the 2860 # `StoredInfoType` API. 2861 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 2862 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 2863 # at least one phrase and every phrase must contain at least 2 characters 2864 # that are letters or digits. [required] 2865 "A String", 2866 ], 2867 }, 2868 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 2869 # is accepted. 2870 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 2871 # Example: gs://[BUCKET_NAME]/dictionary.txt 2872 }, 2873 }, 2874 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 2875 # `InspectDataSource`. Not currently supported in `InspectContent`. 2876 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 2877 # `organizations/433245324/storedInfoTypes/432452342` or 2878 # `projects/project-id/storedInfoTypes/432452342`. 2879 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 2880 # inspection was created. Output-only field, populated by the system. 2881 }, 2882 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 2883 # Rules are applied in order that they are specified. Not supported for the 2884 # `surrogate_type` CustomInfoType. 2885 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 2886 # `CustomInfoType` to alter behavior under certain circumstances, depending 2887 # on the specific details of the rule. Not supported for the `surrogate_type` 2888 # custom infoType. 2889 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 2890 # proximity of hotwords. 2891 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 2892 # The total length of the window cannot exceed 1000 characters. Note that 2893 # the finding itself will be included in the window, so that hotwords may 2894 # be used to match substrings of the finding itself. For example, the 2895 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 2896 # adjusted upwards if the area code is known to be the local area code of 2897 # a company office using the hotword regex "\(xxx\)", where "xxx" 2898 # is the area code in question. 2899 # rule. 2900 "windowAfter": 42, # Number of characters after the finding to consider. 2901 "windowBefore": 42, # Number of characters before the finding to consider. 2902 }, 2903 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 2904 "pattern": "A String", # Pattern defining the regular expression. Its syntax 2905 # (https://github.com/google/re2/wiki/Syntax) can be found under the 2906 # google/re2 repository on GitHub. 2907 "groupIndexes": [ # The index of the submatch to extract as findings. When not 2908 # specified, the entire match is returned. No more than 3 may be included. 2909 42, 2910 ], 2911 }, 2912 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 2913 # part of a detection rule. 2914 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 2915 # levels. For example, if a finding would be `POSSIBLE` without the 2916 # detection rule and `relative_likelihood` is 1, then it is upgraded to 2917 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 2918 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 2919 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 2920 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 2921 # a final likelihood of `LIKELY`. 2922 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 2923 }, 2924 }, 2925 }, 2926 ], 2927 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 2928 # to be returned. It still can be used for rules matching. 2929 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 2930 # altered by a detection rule if the finding meets the criteria specified by 2931 # the rule. Defaults to `VERY_LIKELY` if not specified. 2932 }, 2933 ], 2934 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 2935 # included in the response; see Finding.quote. 2936 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 2937 # Exclusion rules, contained in the set are executed in the end, other 2938 # rules are executed in the order they are specified for each info type. 2939 { # Rule set for modifying a set of infoTypes to alter behavior under certain 2940 # circumstances, depending on the specific details of the rules within the set. 2941 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 2942 { # A single inspection rule to be applied to infoTypes, specified in 2943 # `InspectionRuleSet`. 2944 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 2945 # proximity of hotwords. 2946 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 2947 # The total length of the window cannot exceed 1000 characters. Note that 2948 # the finding itself will be included in the window, so that hotwords may 2949 # be used to match substrings of the finding itself. For example, the 2950 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 2951 # adjusted upwards if the area code is known to be the local area code of 2952 # a company office using the hotword regex "\(xxx\)", where "xxx" 2953 # is the area code in question. 2954 # rule. 2955 "windowAfter": 42, # Number of characters after the finding to consider. 2956 "windowBefore": 42, # Number of characters before the finding to consider. 2957 }, 2958 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 2959 "pattern": "A String", # Pattern defining the regular expression. Its syntax 2960 # (https://github.com/google/re2/wiki/Syntax) can be found under the 2961 # google/re2 repository on GitHub. 2962 "groupIndexes": [ # The index of the submatch to extract as findings. When not 2963 # specified, the entire match is returned. No more than 3 may be included. 2964 42, 2965 ], 2966 }, 2967 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 2968 # part of a detection rule. 2969 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 2970 # levels. For example, if a finding would be `POSSIBLE` without the 2971 # detection rule and `relative_likelihood` is 1, then it is upgraded to 2972 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 2973 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 2974 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 2975 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 2976 # a final likelihood of `LIKELY`. 2977 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 2978 }, 2979 }, 2980 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 2981 # `InspectionRuleSet` are removed from results. 2982 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 2983 "pattern": "A String", # Pattern defining the regular expression. Its syntax 2984 # (https://github.com/google/re2/wiki/Syntax) can be found under the 2985 # google/re2 repository on GitHub. 2986 "groupIndexes": [ # The index of the submatch to extract as findings. When not 2987 # specified, the entire match is returned. No more than 3 may be included. 2988 42, 2989 ], 2990 }, 2991 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 2992 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 2993 # contained within with a finding of an infoType from this list. For 2994 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 2995 # `exclusion_rule` containing `exclude_info_types.info_types` with 2996 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 2997 # with EMAIL_ADDRESS finding. 2998 # That leads to "555-222-2222@example.org" to generate only a single 2999 # finding, namely email address. 3000 { # Type of information detected by the API. 3001 "name": "A String", # Name of the information type. Either a name of your choosing when 3002 # creating a CustomInfoType, or one of the names listed 3003 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3004 # a built-in type. InfoType names should conform to the pattern 3005 # [a-zA-Z0-9_]{1,64}. 3006 }, 3007 ], 3008 }, 3009 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 3010 # be used to match sensitive information specific to the data, such as a list 3011 # of employee IDs or job titles. 3012 # 3013 # Dictionary words are case-insensitive and all characters other than letters 3014 # and digits in the unicode [Basic Multilingual 3015 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 3016 # will be replaced with whitespace when scanning for matches, so the 3017 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 3018 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 3019 # surrounding any match must be of a different type than the adjacent 3020 # characters within the word, so letters must be next to non-letters and 3021 # digits next to non-digits. For example, the dictionary word "jen" will 3022 # match the first three letters of the text "jen123" but will return no 3023 # matches for "jennifer". 3024 # 3025 # Dictionary words containing a large number of characters that are not 3026 # letters or digits may result in unexpected findings because such characters 3027 # are treated as whitespace. The 3028 # [limits](https://cloud.google.com/dlp/limits) page contains details about 3029 # the size limits of dictionaries. For dictionaries that do not fit within 3030 # these constraints, consider using `LargeCustomDictionaryConfig` in the 3031 # `StoredInfoType` API. 3032 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 3033 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 3034 # at least one phrase and every phrase must contain at least 2 characters 3035 # that are letters or digits. [required] 3036 "A String", 3037 ], 3038 }, 3039 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 3040 # is accepted. 3041 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 3042 # Example: gs://[BUCKET_NAME]/dictionary.txt 3043 }, 3044 }, 3045 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 3046 }, 3047 }, 3048 ], 3049 "infoTypes": [ # List of infoTypes this rule set is applied to. 3050 { # Type of information detected by the API. 3051 "name": "A String", # Name of the information type. Either a name of your choosing when 3052 # creating a CustomInfoType, or one of the names listed 3053 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3054 # a built-in type. InfoType names should conform to the pattern 3055 # [a-zA-Z0-9_]{1,64}. 3056 }, 3057 ], 3058 }, 3059 ], 3060 "contentOptions": [ # List of options defining data content to scan. 3061 # If empty, text, images, and other content will be included. 3062 "A String", 3063 ], 3064 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 3065 # InfoType values returned by ListInfoTypes or listed at 3066 # https://cloud.google.com/dlp/docs/infotypes-reference. 3067 # 3068 # When no InfoTypes or CustomInfoTypes are specified in a request, the 3069 # system may automatically choose what detectors to run. By default this may 3070 # be all types, but may change over time as detectors are updated. 3071 # 3072 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 3073 # but may change over time as new InfoTypes are added. If you need precise 3074 # control and predictability as to what detectors are run you should specify 3075 # specific InfoTypes listed in the reference. 3076 { # Type of information detected by the API. 3077 "name": "A String", # Name of the information type. Either a name of your choosing when 3078 # creating a CustomInfoType, or one of the names listed 3079 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3080 # a built-in type. InfoType names should conform to the pattern 3081 # [a-zA-Z0-9_]{1,64}. 3082 }, 3083 ], 3084 }, 3085 "createTime": "A String", # The creation timestamp of a inspectTemplate, output only field. 3086 "name": "A String", # The template name. Output only. 3087 # 3088 # The template will have one of the following formats: 3089 # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR 3090 # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` 3091 }, 3092 "jobConfig": { 3093 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. 3094 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. 3095 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always 3096 # by project and namespace, however the namespace ID may be empty. 3097 # A partition ID identifies a grouping of entities. The grouping is always 3098 # by project and namespace, however the namespace ID may be empty. 3099 # 3100 # A partition ID contains several dimensions: 3101 # project ID and namespace ID. 3102 "projectId": "A String", # The ID of the project to which the entities belong. 3103 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. 3104 }, 3105 "kind": { # A representation of a Datastore kind. # The kind to process. 3106 "name": "A String", # The name of the kind. 3107 }, 3108 }, 3109 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. 3110 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip 3111 # inspection of entire columns which you know have no findings. 3112 { # General identifier of a data field in a storage service. 3113 "name": "A String", # Name describing the field. 3114 }, 3115 ], 3116 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the 3117 # rest of the rows are omitted. If not set, or if set to 0, all rows will be 3118 # scanned. Only one of rows_limit and rows_limit_percent can be specified. 3119 # Cannot be used in conjunction with TimespanConfig. 3120 "sampleMethod": "A String", 3121 "identifyingFields": [ # References to fields uniquely identifying rows within the table. 3122 # Nested fields in the format, like `person.birthdate.year`, are allowed. 3123 { # General identifier of a data field in a storage service. 3124 "name": "A String", # Name describing the field. 3125 }, 3126 ], 3127 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows 3128 # scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and 3129 # 100 means no limit. Defaults to 0. Only one of rows_limit and 3130 # rows_limit_percent can be specified. Cannot be used in conjunction with 3131 # TimespanConfig. 3132 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference. 3133 # identified by its project_id, dataset_id, and table_name. Within a query 3134 # a table is often referenced with a string in the format of: 3135 # `<project_id>:<dataset_id>.<table_id>` or 3136 # `<project_id>.<dataset_id>.<table_id>`. 3137 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 3138 # If omitted, project ID is inferred from the API call. 3139 "tableId": "A String", # Name of the table. 3140 "datasetId": "A String", # Dataset ID of the table. 3141 }, 3142 }, 3143 "timespanConfig": { # Configuration of the timespan of the items to include in scanning. 3144 # Currently only supported when inspecting Google Cloud Storage and BigQuery. 3145 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. 3146 # Used for data sources like Datastore or BigQuery. 3147 # If not specified for BigQuery, table last modification timestamp 3148 # is checked against given time span. 3149 # The valid data types of the timestamp field are: 3150 # for BigQuery - timestamp, date, datetime; 3151 # for Datastore - timestamp. 3152 # Datastore entity will be scanned if the timestamp property does not exist 3153 # or its value is empty or invalid. 3154 "name": "A String", # Name describing the field. 3155 }, 3156 "endTime": "A String", # Exclude files or rows newer than this value. 3157 # If set to zero, no upper time limit is applied. 3158 "startTime": "A String", # Exclude files or rows older than this value. 3159 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out 3160 # a valid start_time to avoid scanning files that have not been modified 3161 # since the last time the JobTrigger executed. This will be based on the 3162 # time of the execution of the last run of the JobTrigger. 3163 }, 3164 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification. 3165 # bucket. 3166 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger 3167 # than this value then the rest of the bytes are omitted. Only one 3168 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 3169 "sampleMethod": "A String", 3170 "fileSet": { # Set of files to scan. # The set of one or more files to scan. 3171 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format 3172 # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. 3173 # 3174 # If the url ends in a trailing slash, the bucket or directory represented 3175 # by the url will be scanned non-recursively (content in sub-directories 3176 # will not be scanned). This means that `gs://mybucket/` is equivalent to 3177 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to 3178 # `gs://mybucket/directory/*`. 3179 # 3180 # Exactly one of `url` or `regex_file_set` must be set. 3181 "regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or 3182 # `regex_file_set` must be set. 3183 # expressions are used to allow fine-grained control over which files in the 3184 # bucket to include. 3185 # 3186 # Included files are those that match at least one item in `include_regex` and 3187 # do not match any items in `exclude_regex`. Note that a file that matches 3188 # items from both lists will _not_ be included. For a match to occur, the 3189 # entire file path (i.e., everything in the url after the bucket name) must 3190 # match the regular expression. 3191 # 3192 # For example, given the input `{bucket_name: "mybucket", include_regex: 3193 # ["directory1/.*"], exclude_regex: 3194 # ["directory1/excluded.*"]}`: 3195 # 3196 # * `gs://mybucket/directory1/myfile` will be included 3197 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches 3198 # across `/`) 3199 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the 3200 # full path doesn't match any items in `include_regex`) 3201 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path 3202 # matches an item in `exclude_regex`) 3203 # 3204 # If `include_regex` is left empty, it will match all files by default 3205 # (this is equivalent to setting `include_regex: [".*"]`). 3206 # 3207 # Some other common use cases: 3208 # 3209 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all 3210 # files in `mybucket` except for .pdf files 3211 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will 3212 # include all files directly under `gs://mybucket/directory/`, without matching 3213 # across `/` 3214 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in 3215 # the bucket that match at least one of these regular expressions will be 3216 # excluded from the scan. 3217 # 3218 # Regular expressions use RE2 3219 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 3220 # under the google/re2 repository on GitHub. 3221 "A String", 3222 ], 3223 "bucketName": "A String", # The name of a Cloud Storage bucket. Required. 3224 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in 3225 # the bucket that match at least one of these regular expressions will be 3226 # included in the set of files, except for those that also match an item in 3227 # `exclude_regex`. Leaving this field empty will match all files by default 3228 # (this is equivalent to including `.*` in the list). 3229 # 3230 # Regular expressions use RE2 3231 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 3232 # under the google/re2 repository on GitHub. 3233 "A String", 3234 ], 3235 }, 3236 }, 3237 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The 3238 # number of bytes scanned is rounded down. Must be between 0 and 100, 3239 # inclusively. Both 0 and 100 means no limit. Defaults to 0. Only one 3240 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 3241 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. 3242 # Number of files scanned is rounded down. Must be between 0 and 100, 3243 # inclusively. Both 0 and 100 means no limit. Defaults to 0. 3244 "fileTypes": [ # List of file type groups to include in the scan. 3245 # If empty, all files are scanned and available data format processors 3246 # are applied. In addition, the binary content of the selected files 3247 # is always scanned as well. 3248 "A String", 3249 ], 3250 }, 3251 }, 3252 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. 3253 # When used with redactContent only info_types and min_likelihood are currently 3254 # used. 3255 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 3256 "limits": { 3257 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 3258 # When set within `InspectContentRequest`, the maximum returned is 2000 3259 # regardless if this is set higher. 3260 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 3261 { # Max findings configuration per infoType, per content item or long 3262 # running DlpJob. 3263 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 3264 # info_type should be provided. If InfoTypeLimit does not have an 3265 # info_type, the DLP API applies the limit against all info_types that 3266 # are found but not specified in another InfoTypeLimit. 3267 "name": "A String", # Name of the information type. Either a name of your choosing when 3268 # creating a CustomInfoType, or one of the names listed 3269 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3270 # a built-in type. InfoType names should conform to the pattern 3271 # [a-zA-Z0-9_]{1,64}. 3272 }, 3273 "maxFindings": 42, # Max findings limit for the given infoType. 3274 }, 3275 ], 3276 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 3277 # When set within `InspectDataSourceRequest`, 3278 # the maximum returned is 2000 regardless if this is set higher. 3279 # When set within `InspectContentRequest`, this field is ignored. 3280 }, 3281 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 3282 # POSSIBLE. 3283 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 3284 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 3285 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 3286 { # Custom information type provided by the user. Used to find domain-specific 3287 # sensitive information configurable to the data in question. 3288 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 3289 "pattern": "A String", # Pattern defining the regular expression. Its syntax 3290 # (https://github.com/google/re2/wiki/Syntax) can be found under the 3291 # google/re2 repository on GitHub. 3292 "groupIndexes": [ # The index of the submatch to extract as findings. When not 3293 # specified, the entire match is returned. No more than 3 may be included. 3294 42, 3295 ], 3296 }, 3297 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 3298 # support reversing. 3299 # such as 3300 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 3301 # These types of transformations are 3302 # those that perform pseudonymization, thereby producing a "surrogate" as 3303 # output. This should be used in conjunction with a field on the 3304 # transformation such as `surrogate_info_type`. This CustomInfoType does 3305 # not support the use of `detection_rules`. 3306 }, 3307 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 3308 # infoType, when the name matches one of existing infoTypes and that infoType 3309 # is specified in `InspectContent.info_types` field. Specifying the latter 3310 # adds findings to the one detected by the system. If built-in info type is 3311 # not specified in `InspectContent.info_types` list then the name is treated 3312 # as a custom info type. 3313 "name": "A String", # Name of the information type. Either a name of your choosing when 3314 # creating a CustomInfoType, or one of the names listed 3315 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3316 # a built-in type. InfoType names should conform to the pattern 3317 # [a-zA-Z0-9_]{1,64}. 3318 }, 3319 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 3320 # be used to match sensitive information specific to the data, such as a list 3321 # of employee IDs or job titles. 3322 # 3323 # Dictionary words are case-insensitive and all characters other than letters 3324 # and digits in the unicode [Basic Multilingual 3325 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 3326 # will be replaced with whitespace when scanning for matches, so the 3327 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 3328 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 3329 # surrounding any match must be of a different type than the adjacent 3330 # characters within the word, so letters must be next to non-letters and 3331 # digits next to non-digits. For example, the dictionary word "jen" will 3332 # match the first three letters of the text "jen123" but will return no 3333 # matches for "jennifer". 3334 # 3335 # Dictionary words containing a large number of characters that are not 3336 # letters or digits may result in unexpected findings because such characters 3337 # are treated as whitespace. The 3338 # [limits](https://cloud.google.com/dlp/limits) page contains details about 3339 # the size limits of dictionaries. For dictionaries that do not fit within 3340 # these constraints, consider using `LargeCustomDictionaryConfig` in the 3341 # `StoredInfoType` API. 3342 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 3343 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 3344 # at least one phrase and every phrase must contain at least 2 characters 3345 # that are letters or digits. [required] 3346 "A String", 3347 ], 3348 }, 3349 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 3350 # is accepted. 3351 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 3352 # Example: gs://[BUCKET_NAME]/dictionary.txt 3353 }, 3354 }, 3355 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 3356 # `InspectDataSource`. Not currently supported in `InspectContent`. 3357 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 3358 # `organizations/433245324/storedInfoTypes/432452342` or 3359 # `projects/project-id/storedInfoTypes/432452342`. 3360 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 3361 # inspection was created. Output-only field, populated by the system. 3362 }, 3363 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 3364 # Rules are applied in order that they are specified. Not supported for the 3365 # `surrogate_type` CustomInfoType. 3366 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 3367 # `CustomInfoType` to alter behavior under certain circumstances, depending 3368 # on the specific details of the rule. Not supported for the `surrogate_type` 3369 # custom infoType. 3370 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 3371 # proximity of hotwords. 3372 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 3373 # The total length of the window cannot exceed 1000 characters. Note that 3374 # the finding itself will be included in the window, so that hotwords may 3375 # be used to match substrings of the finding itself. For example, the 3376 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 3377 # adjusted upwards if the area code is known to be the local area code of 3378 # a company office using the hotword regex "\(xxx\)", where "xxx" 3379 # is the area code in question. 3380 # rule. 3381 "windowAfter": 42, # Number of characters after the finding to consider. 3382 "windowBefore": 42, # Number of characters before the finding to consider. 3383 }, 3384 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 3385 "pattern": "A String", # Pattern defining the regular expression. Its syntax 3386 # (https://github.com/google/re2/wiki/Syntax) can be found under the 3387 # google/re2 repository on GitHub. 3388 "groupIndexes": [ # The index of the submatch to extract as findings. When not 3389 # specified, the entire match is returned. No more than 3 may be included. 3390 42, 3391 ], 3392 }, 3393 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 3394 # part of a detection rule. 3395 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 3396 # levels. For example, if a finding would be `POSSIBLE` without the 3397 # detection rule and `relative_likelihood` is 1, then it is upgraded to 3398 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 3399 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 3400 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 3401 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 3402 # a final likelihood of `LIKELY`. 3403 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 3404 }, 3405 }, 3406 }, 3407 ], 3408 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 3409 # to be returned. It still can be used for rules matching. 3410 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 3411 # altered by a detection rule if the finding meets the criteria specified by 3412 # the rule. Defaults to `VERY_LIKELY` if not specified. 3413 }, 3414 ], 3415 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 3416 # included in the response; see Finding.quote. 3417 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 3418 # Exclusion rules, contained in the set are executed in the end, other 3419 # rules are executed in the order they are specified for each info type. 3420 { # Rule set for modifying a set of infoTypes to alter behavior under certain 3421 # circumstances, depending on the specific details of the rules within the set. 3422 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 3423 { # A single inspection rule to be applied to infoTypes, specified in 3424 # `InspectionRuleSet`. 3425 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 3426 # proximity of hotwords. 3427 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 3428 # The total length of the window cannot exceed 1000 characters. Note that 3429 # the finding itself will be included in the window, so that hotwords may 3430 # be used to match substrings of the finding itself. For example, the 3431 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 3432 # adjusted upwards if the area code is known to be the local area code of 3433 # a company office using the hotword regex "\(xxx\)", where "xxx" 3434 # is the area code in question. 3435 # rule. 3436 "windowAfter": 42, # Number of characters after the finding to consider. 3437 "windowBefore": 42, # Number of characters before the finding to consider. 3438 }, 3439 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 3440 "pattern": "A String", # Pattern defining the regular expression. Its syntax 3441 # (https://github.com/google/re2/wiki/Syntax) can be found under the 3442 # google/re2 repository on GitHub. 3443 "groupIndexes": [ # The index of the submatch to extract as findings. When not 3444 # specified, the entire match is returned. No more than 3 may be included. 3445 42, 3446 ], 3447 }, 3448 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 3449 # part of a detection rule. 3450 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 3451 # levels. For example, if a finding would be `POSSIBLE` without the 3452 # detection rule and `relative_likelihood` is 1, then it is upgraded to 3453 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 3454 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 3455 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 3456 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 3457 # a final likelihood of `LIKELY`. 3458 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 3459 }, 3460 }, 3461 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 3462 # `InspectionRuleSet` are removed from results. 3463 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 3464 "pattern": "A String", # Pattern defining the regular expression. Its syntax 3465 # (https://github.com/google/re2/wiki/Syntax) can be found under the 3466 # google/re2 repository on GitHub. 3467 "groupIndexes": [ # The index of the submatch to extract as findings. When not 3468 # specified, the entire match is returned. No more than 3 may be included. 3469 42, 3470 ], 3471 }, 3472 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 3473 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 3474 # contained within with a finding of an infoType from this list. For 3475 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 3476 # `exclusion_rule` containing `exclude_info_types.info_types` with 3477 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 3478 # with EMAIL_ADDRESS finding. 3479 # That leads to "555-222-2222@example.org" to generate only a single 3480 # finding, namely email address. 3481 { # Type of information detected by the API. 3482 "name": "A String", # Name of the information type. Either a name of your choosing when 3483 # creating a CustomInfoType, or one of the names listed 3484 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3485 # a built-in type. InfoType names should conform to the pattern 3486 # [a-zA-Z0-9_]{1,64}. 3487 }, 3488 ], 3489 }, 3490 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 3491 # be used to match sensitive information specific to the data, such as a list 3492 # of employee IDs or job titles. 3493 # 3494 # Dictionary words are case-insensitive and all characters other than letters 3495 # and digits in the unicode [Basic Multilingual 3496 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 3497 # will be replaced with whitespace when scanning for matches, so the 3498 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 3499 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 3500 # surrounding any match must be of a different type than the adjacent 3501 # characters within the word, so letters must be next to non-letters and 3502 # digits next to non-digits. For example, the dictionary word "jen" will 3503 # match the first three letters of the text "jen123" but will return no 3504 # matches for "jennifer". 3505 # 3506 # Dictionary words containing a large number of characters that are not 3507 # letters or digits may result in unexpected findings because such characters 3508 # are treated as whitespace. The 3509 # [limits](https://cloud.google.com/dlp/limits) page contains details about 3510 # the size limits of dictionaries. For dictionaries that do not fit within 3511 # these constraints, consider using `LargeCustomDictionaryConfig` in the 3512 # `StoredInfoType` API. 3513 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 3514 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 3515 # at least one phrase and every phrase must contain at least 2 characters 3516 # that are letters or digits. [required] 3517 "A String", 3518 ], 3519 }, 3520 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 3521 # is accepted. 3522 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 3523 # Example: gs://[BUCKET_NAME]/dictionary.txt 3524 }, 3525 }, 3526 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 3527 }, 3528 }, 3529 ], 3530 "infoTypes": [ # List of infoTypes this rule set is applied to. 3531 { # Type of information detected by the API. 3532 "name": "A String", # Name of the information type. Either a name of your choosing when 3533 # creating a CustomInfoType, or one of the names listed 3534 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3535 # a built-in type. InfoType names should conform to the pattern 3536 # [a-zA-Z0-9_]{1,64}. 3537 }, 3538 ], 3539 }, 3540 ], 3541 "contentOptions": [ # List of options defining data content to scan. 3542 # If empty, text, images, and other content will be included. 3543 "A String", 3544 ], 3545 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 3546 # InfoType values returned by ListInfoTypes or listed at 3547 # https://cloud.google.com/dlp/docs/infotypes-reference. 3548 # 3549 # When no InfoTypes or CustomInfoTypes are specified in a request, the 3550 # system may automatically choose what detectors to run. By default this may 3551 # be all types, but may change over time as detectors are updated. 3552 # 3553 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 3554 # but may change over time as new InfoTypes are added. If you need precise 3555 # control and predictability as to what detectors are run you should specify 3556 # specific InfoTypes listed in the reference. 3557 { # Type of information detected by the API. 3558 "name": "A String", # Name of the information type. Either a name of your choosing when 3559 # creating a CustomInfoType, or one of the names listed 3560 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3561 # a built-in type. InfoType names should conform to the pattern 3562 # [a-zA-Z0-9_]{1,64}. 3563 }, 3564 ], 3565 }, 3566 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. 3567 # `inspect_config` will be merged into the values persisted as part of the 3568 # template. 3569 "actions": [ # Actions to execute at the completion of the job. 3570 { # A task to execute on the completion of a job. 3571 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. 3572 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. 3573 # OutputStorageConfig. Only a single instance of this action can be 3574 # specified. 3575 # Compatible with: Inspect, Risk 3576 "outputConfig": { # Cloud repository for storing output. 3577 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing 3578 # dataset. If table_id is not set a new one will be generated 3579 # for you with the following format: 3580 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for 3581 # generating the date details. 3582 # 3583 # For Inspect, each column in an existing output table must have the same 3584 # name, type, and mode of a field in the `Finding` object. 3585 # 3586 # For Risk, an existing output table should be the output of a previous 3587 # Risk analysis job run on the same source table, with the same privacy 3588 # metric and quasi-identifiers. Risk jobs that analyze the same table but 3589 # compute a different privacy metric, or use different sets of 3590 # quasi-identifiers, cannot store their results in the same table. 3591 # identified by its project_id, dataset_id, and table_name. Within a query 3592 # a table is often referenced with a string in the format of: 3593 # `<project_id>:<dataset_id>.<table_id>` or 3594 # `<project_id>.<dataset_id>.<table_id>`. 3595 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 3596 # If omitted, project ID is inferred from the API call. 3597 "tableId": "A String", # Name of the table. 3598 "datasetId": "A String", # Dataset ID of the table. 3599 }, 3600 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only 3601 # used for Inspect and must be unspecified for Risk jobs. Columns are derived 3602 # from the `Finding` object. If appending to an existing table, any columns 3603 # from the predefined schema that are missing will be added. No columns in 3604 # the existing table will be deleted. 3605 # 3606 # If unspecified, then all available columns will be used for a new table or 3607 # an (existing) table with no schema, and no changes will be made to an 3608 # existing table that has a schema. 3609 }, 3610 }, 3611 "jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's 3612 # completion/failure. 3613 # completion/failure. 3614 }, 3615 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). 3616 # Command Center (CSCC Alpha). 3617 # This action is only available for projects which are parts of 3618 # an organization and whitelisted for the alpha Cloud Security Command 3619 # Center. 3620 # The action will publish count of finding instances and their info types. 3621 # The summary of findings will be persisted in CSCC and are governed by CSCC 3622 # service-specific policy, see https://cloud.google.com/terms/service-terms 3623 # Only a single instance of this action can be specified. 3624 # Compatible with: Inspect 3625 }, 3626 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic. 3627 # message contains a single field, `DlpJobName`, which is equal to the 3628 # finished job's 3629 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). 3630 # Compatible with: Inspect, Risk 3631 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given 3632 # publishing access rights to the DLP API service account executing 3633 # the long running DlpJob sending the notifications. 3634 # Format is projects/{project}/topics/{topic}. 3635 }, 3636 }, 3637 ], 3638 }, 3639 }, 3640 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. 3641 "infoTypeStats": [ # Statistics of how many instances of each info type were found during 3642 # inspect job. 3643 { # Statistics regarding a specific InfoType. 3644 "count": "A String", # Number of findings for this infoType. 3645 "infoType": { # Type of information detected by the API. # The type of finding this stat is for. 3646 "name": "A String", # Name of the information type. Either a name of your choosing when 3647 # creating a CustomInfoType, or one of the names listed 3648 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 3649 # a built-in type. InfoType names should conform to the pattern 3650 # [a-zA-Z0-9_]{1,64}. 3651 }, 3652 }, 3653 ], 3654 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. 3655 "processedBytes": "A String", # Total size in bytes that were processed. 3656 }, 3657 }, 3658 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. 3659 "numericalStatsResult": { # Result of the numerical stats computation. 3660 "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal 3661 # sized buckets. 3662 { # Set of primitive values supported by the system. 3663 # Note that for the purposes of inspection or transformation, the number 3664 # of bytes considered to comprise a 'Value' is based on its representation 3665 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3666 # 123456789, the number of bytes would be counted as 9, even though an 3667 # int64 only holds up to 8 bytes of data. 3668 "floatValue": 3.14, 3669 "timestampValue": "A String", 3670 "dayOfWeekValue": "A String", 3671 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3672 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3673 # types are google.type.Date and `google.protobuf.Timestamp`. 3674 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3675 # to allow the value "24:00:00" for scenarios like business closing time. 3676 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3677 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3678 # allow the value 60 if it allows leap-seconds. 3679 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3680 }, 3681 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3682 # and time zone are either specified elsewhere or are not significant. The date 3683 # is relative to the Proleptic Gregorian Calendar. This can represent: 3684 # 3685 # * A full date, with non-zero year, month and day values 3686 # * A month and day value, with a zero year, e.g. an anniversary 3687 # * A year on its own, with zero month and day values 3688 # * A year and month value, with a zero day, e.g. a credit card expiration date 3689 # 3690 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3691 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3692 # a year. 3693 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3694 # if specifying a year by itself or a year and month where the day is not 3695 # significant. 3696 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3697 # month and day. 3698 }, 3699 "stringValue": "A String", 3700 "booleanValue": True or False, 3701 "integerValue": "A String", 3702 }, 3703 ], 3704 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. 3705 # Note that for the purposes of inspection or transformation, the number 3706 # of bytes considered to comprise a 'Value' is based on its representation 3707 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3708 # 123456789, the number of bytes would be counted as 9, even though an 3709 # int64 only holds up to 8 bytes of data. 3710 "floatValue": 3.14, 3711 "timestampValue": "A String", 3712 "dayOfWeekValue": "A String", 3713 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3714 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3715 # types are google.type.Date and `google.protobuf.Timestamp`. 3716 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3717 # to allow the value "24:00:00" for scenarios like business closing time. 3718 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3719 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3720 # allow the value 60 if it allows leap-seconds. 3721 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3722 }, 3723 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3724 # and time zone are either specified elsewhere or are not significant. The date 3725 # is relative to the Proleptic Gregorian Calendar. This can represent: 3726 # 3727 # * A full date, with non-zero year, month and day values 3728 # * A month and day value, with a zero year, e.g. an anniversary 3729 # * A year on its own, with zero month and day values 3730 # * A year and month value, with a zero day, e.g. a credit card expiration date 3731 # 3732 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3733 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3734 # a year. 3735 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3736 # if specifying a year by itself or a year and month where the day is not 3737 # significant. 3738 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3739 # month and day. 3740 }, 3741 "stringValue": "A String", 3742 "booleanValue": True or False, 3743 "integerValue": "A String", 3744 }, 3745 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. 3746 # Note that for the purposes of inspection or transformation, the number 3747 # of bytes considered to comprise a 'Value' is based on its representation 3748 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3749 # 123456789, the number of bytes would be counted as 9, even though an 3750 # int64 only holds up to 8 bytes of data. 3751 "floatValue": 3.14, 3752 "timestampValue": "A String", 3753 "dayOfWeekValue": "A String", 3754 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3755 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3756 # types are google.type.Date and `google.protobuf.Timestamp`. 3757 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3758 # to allow the value "24:00:00" for scenarios like business closing time. 3759 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3760 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3761 # allow the value 60 if it allows leap-seconds. 3762 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3763 }, 3764 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3765 # and time zone are either specified elsewhere or are not significant. The date 3766 # is relative to the Proleptic Gregorian Calendar. This can represent: 3767 # 3768 # * A full date, with non-zero year, month and day values 3769 # * A month and day value, with a zero year, e.g. an anniversary 3770 # * A year on its own, with zero month and day values 3771 # * A year and month value, with a zero day, e.g. a credit card expiration date 3772 # 3773 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3774 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3775 # a year. 3776 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3777 # if specifying a year by itself or a year and month where the day is not 3778 # significant. 3779 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3780 # month and day. 3781 }, 3782 "stringValue": "A String", 3783 "booleanValue": True or False, 3784 "integerValue": "A String", 3785 }, 3786 }, 3787 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an 3788 # estimation, not exact values. 3789 "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value 3790 # doesn't correspond to any such interval, the associated frequency is 3791 # zero. For example, the following records: 3792 # {min_anonymity: 1, max_anonymity: 1, frequency: 17} 3793 # {min_anonymity: 2, max_anonymity: 3, frequency: 42} 3794 # {min_anonymity: 5, max_anonymity: 10, frequency: 99} 3795 # mean that there are no record with an estimated anonymity of 4, 5, or 3796 # larger than 10. 3797 { # A KMapEstimationHistogramBucket message with the following values: 3798 # min_anonymity: 3 3799 # max_anonymity: 5 3800 # frequency: 42 3801 # means that there are 42 records whose quasi-identifier values correspond 3802 # to 3, 4 or 5 people in the overlying population. An important particular 3803 # case is when min_anonymity = max_anonymity = 1: the frequency field then 3804 # corresponds to the number of uniquely identifiable records. 3805 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 3806 # number of classes returned per bucket is capped at 20. 3807 { # A tuple of values for the quasi-identifier columns. 3808 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. 3809 "quasiIdsValues": [ # The quasi-identifier values. 3810 { # Set of primitive values supported by the system. 3811 # Note that for the purposes of inspection or transformation, the number 3812 # of bytes considered to comprise a 'Value' is based on its representation 3813 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3814 # 123456789, the number of bytes would be counted as 9, even though an 3815 # int64 only holds up to 8 bytes of data. 3816 "floatValue": 3.14, 3817 "timestampValue": "A String", 3818 "dayOfWeekValue": "A String", 3819 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3820 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3821 # types are google.type.Date and `google.protobuf.Timestamp`. 3822 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3823 # to allow the value "24:00:00" for scenarios like business closing time. 3824 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3825 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3826 # allow the value 60 if it allows leap-seconds. 3827 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3828 }, 3829 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3830 # and time zone are either specified elsewhere or are not significant. The date 3831 # is relative to the Proleptic Gregorian Calendar. This can represent: 3832 # 3833 # * A full date, with non-zero year, month and day values 3834 # * A month and day value, with a zero year, e.g. an anniversary 3835 # * A year on its own, with zero month and day values 3836 # * A year and month value, with a zero day, e.g. a credit card expiration date 3837 # 3838 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3839 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3840 # a year. 3841 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3842 # if specifying a year by itself or a year and month where the day is not 3843 # significant. 3844 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3845 # month and day. 3846 }, 3847 "stringValue": "A String", 3848 "booleanValue": True or False, 3849 "integerValue": "A String", 3850 }, 3851 ], 3852 }, 3853 ], 3854 "minAnonymity": "A String", # Always positive. 3855 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 3856 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. 3857 "bucketSize": "A String", # Number of records within these anonymity bounds. 3858 }, 3859 ], 3860 }, 3861 "kAnonymityResult": { # Result of the k-anonymity computation. 3862 "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. 3863 { 3864 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 3865 # classes returned per bucket is capped at 20. 3866 { # The set of columns' values that share the same ldiversity value 3867 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per 3868 # quasi-identifier column in the original KAnonymity metric message. 3869 # The order is always the same as the original request. 3870 { # Set of primitive values supported by the system. 3871 # Note that for the purposes of inspection or transformation, the number 3872 # of bytes considered to comprise a 'Value' is based on its representation 3873 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3874 # 123456789, the number of bytes would be counted as 9, even though an 3875 # int64 only holds up to 8 bytes of data. 3876 "floatValue": 3.14, 3877 "timestampValue": "A String", 3878 "dayOfWeekValue": "A String", 3879 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3880 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3881 # types are google.type.Date and `google.protobuf.Timestamp`. 3882 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3883 # to allow the value "24:00:00" for scenarios like business closing time. 3884 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3885 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3886 # allow the value 60 if it allows leap-seconds. 3887 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3888 }, 3889 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3890 # and time zone are either specified elsewhere or are not significant. The date 3891 # is relative to the Proleptic Gregorian Calendar. This can represent: 3892 # 3893 # * A full date, with non-zero year, month and day values 3894 # * A month and day value, with a zero year, e.g. an anniversary 3895 # * A year on its own, with zero month and day values 3896 # * A year and month value, with a zero day, e.g. a credit card expiration date 3897 # 3898 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3899 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3900 # a year. 3901 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3902 # if specifying a year by itself or a year and month where the day is not 3903 # significant. 3904 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3905 # month and day. 3906 }, 3907 "stringValue": "A String", 3908 "booleanValue": True or False, 3909 "integerValue": "A String", 3910 }, 3911 ], 3912 "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the 3913 # above set of values. 3914 }, 3915 ], 3916 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 3917 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. 3918 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. 3919 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 3920 }, 3921 ], 3922 }, 3923 "lDiversityResult": { # Result of the l-diversity computation. 3924 "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. 3925 { 3926 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 3927 # classes returned per bucket is capped at 20. 3928 { # The set of columns' values that share the same ldiversity value. 3929 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. 3930 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence 3931 # class. The order is always the same as the original request. 3932 { # Set of primitive values supported by the system. 3933 # Note that for the purposes of inspection or transformation, the number 3934 # of bytes considered to comprise a 'Value' is based on its representation 3935 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3936 # 123456789, the number of bytes would be counted as 9, even though an 3937 # int64 only holds up to 8 bytes of data. 3938 "floatValue": 3.14, 3939 "timestampValue": "A String", 3940 "dayOfWeekValue": "A String", 3941 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3942 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3943 # types are google.type.Date and `google.protobuf.Timestamp`. 3944 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3945 # to allow the value "24:00:00" for scenarios like business closing time. 3946 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3947 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3948 # allow the value 60 if it allows leap-seconds. 3949 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3950 }, 3951 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3952 # and time zone are either specified elsewhere or are not significant. The date 3953 # is relative to the Proleptic Gregorian Calendar. This can represent: 3954 # 3955 # * A full date, with non-zero year, month and day values 3956 # * A month and day value, with a zero year, e.g. an anniversary 3957 # * A year on its own, with zero month and day values 3958 # * A year and month value, with a zero day, e.g. a credit card expiration date 3959 # 3960 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 3961 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 3962 # a year. 3963 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 3964 # if specifying a year by itself or a year and month where the day is not 3965 # significant. 3966 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 3967 # month and day. 3968 }, 3969 "stringValue": "A String", 3970 "booleanValue": True or False, 3971 "integerValue": "A String", 3972 }, 3973 ], 3974 "topSensitiveValues": [ # Estimated frequencies of top sensitive values. 3975 { # A value of a field, including its frequency. 3976 "count": "A String", # How many times the value is contained in the field. 3977 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 3978 # Note that for the purposes of inspection or transformation, the number 3979 # of bytes considered to comprise a 'Value' is based on its representation 3980 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 3981 # 123456789, the number of bytes would be counted as 9, even though an 3982 # int64 only holds up to 8 bytes of data. 3983 "floatValue": 3.14, 3984 "timestampValue": "A String", 3985 "dayOfWeekValue": "A String", 3986 "timeValue": { # Represents a time of day. The date and time zone are either not significant 3987 # or are specified elsewhere. An API may choose to allow leap seconds. Related 3988 # types are google.type.Date and `google.protobuf.Timestamp`. 3989 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 3990 # to allow the value "24:00:00" for scenarios like business closing time. 3991 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 3992 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 3993 # allow the value 60 if it allows leap-seconds. 3994 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 3995 }, 3996 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 3997 # and time zone are either specified elsewhere or are not significant. The date 3998 # is relative to the Proleptic Gregorian Calendar. This can represent: 3999 # 4000 # * A full date, with non-zero year, month and day values 4001 # * A month and day value, with a zero year, e.g. an anniversary 4002 # * A year on its own, with zero month and day values 4003 # * A year and month value, with a zero day, e.g. a credit card expiration date 4004 # 4005 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 4006 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 4007 # a year. 4008 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 4009 # if specifying a year by itself or a year and month where the day is not 4010 # significant. 4011 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 4012 # month and day. 4013 }, 4014 "stringValue": "A String", 4015 "booleanValue": True or False, 4016 "integerValue": "A String", 4017 }, 4018 }, 4019 ], 4020 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. 4021 }, 4022 ], 4023 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 4024 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 4025 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence 4026 # classes in this bucket. 4027 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence 4028 # classes in this bucket. 4029 }, 4030 ], 4031 }, 4032 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. 4033 "numericalStatsConfig": { # Compute numerical stats over an individual column, including 4034 # min, max, and quantiles. 4035 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are 4036 # integer, float, date, datetime, timestamp, time. 4037 "name": "A String", # Name describing the field. 4038 }, 4039 }, 4040 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what 4041 # is called "journalist risk" in the literature, except the attack dataset is 4042 # statistically modeled instead of being perfectly known. This can be done 4043 # using publicly available data (like the US Census), or using a custom 4044 # statistical model (indicated as one or several BigQuery tables), or by 4045 # extrapolating from the distribution of values in the input dataset. 4046 # A column with a semantic tag attached. 4047 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 4048 # Required if no column is tagged with a region-specific InfoType (like 4049 # US_ZIP_5) or a region code. 4050 "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the 4051 # same tag. [required] 4052 { 4053 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 4054 "name": "A String", # Name describing the field. 4055 }, 4056 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 4057 # indicate an auxiliary table that contains statistical information on 4058 # the possible values of this column (below). 4059 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 4060 # dataset as a statistical model of population, if available. We 4061 # currently support US ZIP codes, region codes, ages and genders. 4062 # To programmatically obtain the list of supported InfoTypes, use 4063 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 4064 "name": "A String", # Name of the information type. Either a name of your choosing when 4065 # creating a CustomInfoType, or one of the names listed 4066 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4067 # a built-in type. InfoType names should conform to the pattern 4068 # [a-zA-Z0-9_]{1,64}. 4069 }, 4070 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 4071 # the distribution of values in the input data 4072 # empty messages in your APIs. A typical example is to use it as the request 4073 # or the response type of an API method. For instance: 4074 # 4075 # service Foo { 4076 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 4077 # } 4078 # 4079 # The JSON representation for `Empty` is empty JSON object `{}`. 4080 }, 4081 }, 4082 ], 4083 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 4084 # used to tag a quasi-identifiers column must appear in exactly one column 4085 # of one auxiliary table. 4086 { # An auxiliary table contains statistical information on the relative 4087 # frequency of different quasi-identifiers values. It has one or several 4088 # quasi-identifiers columns, and one column that indicates the relative 4089 # frequency of each quasi-identifier tuple. 4090 # If a tuple is present in the data but not in the auxiliary table, the 4091 # corresponding relative frequency is assumed to be zero (and thus, the 4092 # tuple is highly reidentifiable). 4093 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 4094 # between 0 and 1 (inclusive). Null values are assumed to be zero. 4095 # [required] 4096 "name": "A String", # Name describing the field. 4097 }, 4098 "quasiIds": [ # Quasi-identifier columns. [required] 4099 { # A quasi-identifier column has a custom_tag, used to know which column 4100 # in the data corresponds to which column in the statistical model. 4101 "field": { # General identifier of a data field in a storage service. 4102 "name": "A String", # Name describing the field. 4103 }, 4104 "customTag": "A String", 4105 }, 4106 ], 4107 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 4108 # identified by its project_id, dataset_id, and table_name. Within a query 4109 # a table is often referenced with a string in the format of: 4110 # `<project_id>:<dataset_id>.<table_id>` or 4111 # `<project_id>.<dataset_id>.<table_id>`. 4112 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 4113 # If omitted, project ID is inferred from the API call. 4114 "tableId": "A String", # Name of the table. 4115 "datasetId": "A String", # Dataset ID of the table. 4116 }, 4117 }, 4118 ], 4119 }, 4120 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. 4121 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. 4122 "name": "A String", # Name describing the field. 4123 }, 4124 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are 4125 # defined for the l-diversity computation. When multiple fields are 4126 # specified, they are considered a single composite key. 4127 { # General identifier of a data field in a storage service. 4128 "name": "A String", # Name describing the field. 4129 }, 4130 ], 4131 }, 4132 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to 4133 # figure out that one given individual appears in a de-identified dataset. 4134 # Similarly to the k-map metric, we cannot compute δ-presence exactly without 4135 # knowing the attack dataset, so we use a statistical model instead. 4136 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 4137 # Required if no column is tagged with a region-specific InfoType (like 4138 # US_ZIP_5) or a region code. 4139 "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the 4140 # same tag. [required] 4141 { # A column with a semantic tag attached. 4142 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 4143 "name": "A String", # Name describing the field. 4144 }, 4145 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 4146 # indicate an auxiliary table that contains statistical information on 4147 # the possible values of this column (below). 4148 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 4149 # dataset as a statistical model of population, if available. We 4150 # currently support US ZIP codes, region codes, ages and genders. 4151 # To programmatically obtain the list of supported InfoTypes, use 4152 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 4153 "name": "A String", # Name of the information type. Either a name of your choosing when 4154 # creating a CustomInfoType, or one of the names listed 4155 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4156 # a built-in type. InfoType names should conform to the pattern 4157 # [a-zA-Z0-9_]{1,64}. 4158 }, 4159 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 4160 # the distribution of values in the input data 4161 # empty messages in your APIs. A typical example is to use it as the request 4162 # or the response type of an API method. For instance: 4163 # 4164 # service Foo { 4165 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 4166 # } 4167 # 4168 # The JSON representation for `Empty` is empty JSON object `{}`. 4169 }, 4170 }, 4171 ], 4172 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 4173 # used to tag a quasi-identifiers field must appear in exactly one 4174 # field of one auxiliary table. 4175 { # An auxiliary table containing statistical information on the relative 4176 # frequency of different quasi-identifiers values. It has one or several 4177 # quasi-identifiers columns, and one column that indicates the relative 4178 # frequency of each quasi-identifier tuple. 4179 # If a tuple is present in the data but not in the auxiliary table, the 4180 # corresponding relative frequency is assumed to be zero (and thus, the 4181 # tuple is highly reidentifiable). 4182 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 4183 # between 0 and 1 (inclusive). Null values are assumed to be zero. 4184 # [required] 4185 "name": "A String", # Name describing the field. 4186 }, 4187 "quasiIds": [ # Quasi-identifier columns. [required] 4188 { # A quasi-identifier column has a custom_tag, used to know which column 4189 # in the data corresponds to which column in the statistical model. 4190 "field": { # General identifier of a data field in a storage service. 4191 "name": "A String", # Name describing the field. 4192 }, 4193 "customTag": "A String", 4194 }, 4195 ], 4196 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 4197 # identified by its project_id, dataset_id, and table_name. Within a query 4198 # a table is often referenced with a string in the format of: 4199 # `<project_id>:<dataset_id>.<table_id>` or 4200 # `<project_id>.<dataset_id>.<table_id>`. 4201 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 4202 # If omitted, project ID is inferred from the API call. 4203 "tableId": "A String", # Name of the table. 4204 "datasetId": "A String", # Dataset ID of the table. 4205 }, 4206 }, 4207 ], 4208 }, 4209 "categoricalStatsConfig": { # Compute numerical stats over an individual column, including 4210 # number of distinct values and value count distribution. 4211 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are 4212 # supported except for arrays and structs. However, it may be more 4213 # informative to use NumericalStats when the field type is supported, 4214 # depending on the data. 4215 "name": "A String", # Name describing the field. 4216 }, 4217 }, 4218 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. 4219 "entityId": { # An entity in a dataset is a field or set of fields that correspond to a # Optional message indicating that multiple rows might be associated to a 4220 # single individual. If the same entity_id is associated to multiple 4221 # quasi-identifier tuples over distinct rows, we consider the entire 4222 # collection of tuples as the composite quasi-identifier. This collection 4223 # is a multiset: the order in which the different tuples appear in the 4224 # dataset is ignored, but their frequency is taken into account. 4225 # 4226 # Important note: a maximum of 1000 rows can be associated to a single 4227 # entity ID. If more rows are associated with the same entity ID, some 4228 # might be ignored. 4229 # single person. For example, in medical records the `EntityId` might be a 4230 # patient identifier, or for financial records it might be an account 4231 # identifier. This message is used when generalizations or analysis must take 4232 # into account that multiple rows correspond to the same entity. 4233 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. 4234 "name": "A String", # Name describing the field. 4235 }, 4236 }, 4237 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are 4238 # specified, they are considered a single composite key. Structs and 4239 # repeated data types are not supported; however, nested fields are 4240 # supported so long as they are not structs themselves or nested within 4241 # a repeated field. 4242 { # General identifier of a data field in a storage service. 4243 "name": "A String", # Name describing the field. 4244 }, 4245 ], 4246 }, 4247 }, 4248 "categoricalStatsResult": { # Result of the categorical stats computation. 4249 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. 4250 { 4251 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of 4252 # values returned per bucket is capped at 20. 4253 { # A value of a field, including its frequency. 4254 "count": "A String", # How many times the value is contained in the field. 4255 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 4256 # Note that for the purposes of inspection or transformation, the number 4257 # of bytes considered to comprise a 'Value' is based on its representation 4258 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 4259 # 123456789, the number of bytes would be counted as 9, even though an 4260 # int64 only holds up to 8 bytes of data. 4261 "floatValue": 3.14, 4262 "timestampValue": "A String", 4263 "dayOfWeekValue": "A String", 4264 "timeValue": { # Represents a time of day. The date and time zone are either not significant 4265 # or are specified elsewhere. An API may choose to allow leap seconds. Related 4266 # types are google.type.Date and `google.protobuf.Timestamp`. 4267 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 4268 # to allow the value "24:00:00" for scenarios like business closing time. 4269 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 4270 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 4271 # allow the value 60 if it allows leap-seconds. 4272 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 4273 }, 4274 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 4275 # and time zone are either specified elsewhere or are not significant. The date 4276 # is relative to the Proleptic Gregorian Calendar. This can represent: 4277 # 4278 # * A full date, with non-zero year, month and day values 4279 # * A month and day value, with a zero year, e.g. an anniversary 4280 # * A year on its own, with zero month and day values 4281 # * A year and month value, with a zero day, e.g. a credit card expiration date 4282 # 4283 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 4284 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 4285 # a year. 4286 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 4287 # if specifying a year by itself or a year and month where the day is not 4288 # significant. 4289 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 4290 # month and day. 4291 }, 4292 "stringValue": "A String", 4293 "booleanValue": True or False, 4294 "integerValue": "A String", 4295 }, 4296 }, 4297 ], 4298 "bucketValueCount": "A String", # Total number of distinct values in this bucket. 4299 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. 4300 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. 4301 "bucketSize": "A String", # Total number of values in this bucket. 4302 }, 4303 ], 4304 }, 4305 "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an 4306 # estimation, not exact values. 4307 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a 4308 # value doesn't correspond to any such interval, the associated frequency 4309 # is zero. For example, the following records: 4310 # {min_probability: 0, max_probability: 0.1, frequency: 17} 4311 # {min_probability: 0.2, max_probability: 0.3, frequency: 42} 4312 # {min_probability: 0.3, max_probability: 0.4, frequency: 99} 4313 # mean that there are no record with an estimated probability in [0.1, 0.2) 4314 # nor larger or equal to 0.4. 4315 { # A DeltaPresenceEstimationHistogramBucket message with the following 4316 # values: 4317 # min_probability: 0.1 4318 # max_probability: 0.2 4319 # frequency: 42 4320 # means that there are 42 records for which δ is in [0.1, 0.2). An 4321 # important particular case is when min_probability = max_probability = 1: 4322 # then, every individual who shares this quasi-identifier combination is in 4323 # the dataset. 4324 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 4325 # number of classes returned per bucket is capped at 20. 4326 { # A tuple of values for the quasi-identifier columns. 4327 "quasiIdsValues": [ # The quasi-identifier values. 4328 { # Set of primitive values supported by the system. 4329 # Note that for the purposes of inspection or transformation, the number 4330 # of bytes considered to comprise a 'Value' is based on its representation 4331 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 4332 # 123456789, the number of bytes would be counted as 9, even though an 4333 # int64 only holds up to 8 bytes of data. 4334 "floatValue": 3.14, 4335 "timestampValue": "A String", 4336 "dayOfWeekValue": "A String", 4337 "timeValue": { # Represents a time of day. The date and time zone are either not significant 4338 # or are specified elsewhere. An API may choose to allow leap seconds. Related 4339 # types are google.type.Date and `google.protobuf.Timestamp`. 4340 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 4341 # to allow the value "24:00:00" for scenarios like business closing time. 4342 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 4343 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 4344 # allow the value 60 if it allows leap-seconds. 4345 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 4346 }, 4347 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 4348 # and time zone are either specified elsewhere or are not significant. The date 4349 # is relative to the Proleptic Gregorian Calendar. This can represent: 4350 # 4351 # * A full date, with non-zero year, month and day values 4352 # * A month and day value, with a zero year, e.g. an anniversary 4353 # * A year on its own, with zero month and day values 4354 # * A year and month value, with a zero day, e.g. a credit card expiration date 4355 # 4356 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 4357 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 4358 # a year. 4359 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 4360 # if specifying a year by itself or a year and month where the day is not 4361 # significant. 4362 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 4363 # month and day. 4364 }, 4365 "stringValue": "A String", 4366 "booleanValue": True or False, 4367 "integerValue": "A String", 4368 }, 4369 ], 4370 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these 4371 # quasi-identifier values is in the dataset. This value, typically called 4372 # δ, is the ratio between the number of records in the dataset with these 4373 # quasi-identifier values, and the total number of individuals (inside 4374 # *and* outside the dataset) with these quasi-identifier values. 4375 # For example, if there are 15 individuals in the dataset who share the 4376 # same quasi-identifier values, and an estimated 100 people in the entire 4377 # population with these values, then δ is 0.15. 4378 }, 4379 ], 4380 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 4381 "bucketSize": "A String", # Number of records within these probability bounds. 4382 "maxProbability": 3.14, # Always greater than or equal to min_probability. 4383 "minProbability": 3.14, # Between 0 and 1. 4384 }, 4385 ], 4386 }, 4387 "requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over. 4388 # identified by its project_id, dataset_id, and table_name. Within a query 4389 # a table is often referenced with a string in the format of: 4390 # `<project_id>:<dataset_id>.<table_id>` or 4391 # `<project_id>.<dataset_id>.<table_id>`. 4392 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 4393 # If omitted, project ID is inferred from the API call. 4394 "tableId": "A String", # Name of the table. 4395 "datasetId": "A String", # Dataset ID of the table. 4396 }, 4397 }, 4398 "state": "A String", # State of a job. 4399 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that 4400 # instantiated the job. 4401 "startTime": "A String", # Time when the job started. 4402 "endTime": "A String", # Time when the job finished. 4403 "type": "A String", # The type of job. 4404 "createTime": "A String", # Time when the job was created. 4405 }</pre> 4406</div> 4407 4408<div class="method"> 4409 <code class="details" id="list">list(parent, orderBy=None, type=None, pageSize=None, pageToken=None, x__xgafv=None, filter=None)</code> 4410 <pre>Lists DlpJobs that match the specified filter in the request. 4411See https://cloud.google.com/dlp/docs/inspecting-storage and 4412https://cloud.google.com/dlp/docs/compute-risk-analysis to learn more. 4413 4414Args: 4415 parent: string, The parent resource name, for example projects/my-project-id. (required) 4416 orderBy: string, Optional comma separated list of fields to order by, 4417followed by `asc` or `desc` postfix. This list is case-insensitive, 4418default sorting order is ascending, redundant space characters are 4419insignificant. 4420 4421Example: `name asc, end_time asc, create_time desc` 4422 4423Supported fields are: 4424 4425- `create_time`: corresponds to time the job was created. 4426- `end_time`: corresponds to time the job ended. 4427- `name`: corresponds to job's name. 4428- `state`: corresponds to `state` 4429 type: string, The type of job. Defaults to `DlpJobType.INSPECT` 4430 pageSize: integer, The standard list page size. 4431 pageToken: string, The standard list page token. 4432 x__xgafv: string, V1 error format. 4433 Allowed values 4434 1 - v1 error format 4435 2 - v2 error format 4436 filter: string, Optional. Allows filtering. 4437 4438Supported syntax: 4439 4440* Filter expressions are made up of one or more restrictions. 4441* Restrictions can be combined by `AND` or `OR` logical operators. A 4442sequence of restrictions implicitly uses `AND`. 4443* A restriction has the form of `<field> <operator> <value>`. 4444* Supported fields/values for inspect jobs: 4445 - `state` - PENDING|RUNNING|CANCELED|FINISHED|FAILED 4446 - `inspected_storage` - DATASTORE|CLOUD_STORAGE|BIGQUERY 4447 - `trigger_name` - The resource name of the trigger that created job. 4448 - 'end_time` - Corresponds to time the job finished. 4449 - 'start_time` - Corresponds to time the job finished. 4450* Supported fields for risk analysis jobs: 4451 - `state` - RUNNING|CANCELED|FINISHED|FAILED 4452 - 'end_time` - Corresponds to time the job finished. 4453 - 'start_time` - Corresponds to time the job finished. 4454* The operator must be `=` or `!=`. 4455 4456Examples: 4457 4458* inspected_storage = cloud_storage AND state = done 4459* inspected_storage = cloud_storage OR inspected_storage = bigquery 4460* inspected_storage = cloud_storage AND (state = done OR state = canceled) 4461* end_time > \"2017-12-12T00:00:00+00:00\" 4462 4463The length of this field should be no more than 500 characters. 4464 4465Returns: 4466 An object of the form: 4467 4468 { # The response message for listing DLP jobs. 4469 "nextPageToken": "A String", # The standard List next-page token. 4470 "jobs": [ # A list of DlpJobs that matches the specified filter in the request. 4471 { # Combines all of the information about a DLP job. 4472 "errors": [ # A stream of errors encountered running the job. 4473 { # Details information about an error encountered during job execution or 4474 # the results of an unsuccessful activation of the JobTrigger. 4475 # Output only field. 4476 "timestamps": [ # The times the error occurred. 4477 "A String", 4478 ], 4479 "details": { # The `Status` type defines a logical error model that is suitable for 4480 # different programming environments, including REST APIs and RPC APIs. It is 4481 # used by [gRPC](https://github.com/grpc). Each `Status` message contains 4482 # three pieces of data: error code, error message, and error details. 4483 # 4484 # You can find out more about this error model and how to work with it in the 4485 # [API Design Guide](https://cloud.google.com/apis/design/errors). 4486 "message": "A String", # A developer-facing error message, which should be in English. Any 4487 # user-facing error message should be localized and sent in the 4488 # google.rpc.Status.details field, or localized by the client. 4489 "code": 42, # The status code, which should be an enum value of google.rpc.Code. 4490 "details": [ # A list of messages that carry the error details. There is a common set of 4491 # message types for APIs to use. 4492 { 4493 "a_key": "", # Properties of the object. Contains field @type with type URL. 4494 }, 4495 ], 4496 }, 4497 }, 4498 ], 4499 "name": "A String", # The server-assigned name. 4500 "inspectDetails": { # The results of an inspect DataSource job. # Results from inspecting a data source. 4501 "requestedOptions": { # The configuration used for this job. 4502 "snapshotInspectTemplate": { # The inspectTemplate contains a configuration (set of types of sensitive data # If run with an InspectTemplate, a snapshot of its state at the time of 4503 # this run. 4504 # to be detected) to be used anywhere you otherwise would normally specify 4505 # InspectConfig. See https://cloud.google.com/dlp/docs/concepts-templates 4506 # to learn more. 4507 "updateTime": "A String", # The last update timestamp of a inspectTemplate, output only field. 4508 "displayName": "A String", # Display name (max 256 chars). 4509 "description": "A String", # Short description (max 256 chars). 4510 "inspectConfig": { # Configuration description of the scanning process. # The core content of the template. Configuration of the scanning process. 4511 # When used with redactContent only info_types and min_likelihood are currently 4512 # used. 4513 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 4514 "limits": { 4515 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 4516 # When set within `InspectContentRequest`, the maximum returned is 2000 4517 # regardless if this is set higher. 4518 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 4519 { # Max findings configuration per infoType, per content item or long 4520 # running DlpJob. 4521 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 4522 # info_type should be provided. If InfoTypeLimit does not have an 4523 # info_type, the DLP API applies the limit against all info_types that 4524 # are found but not specified in another InfoTypeLimit. 4525 "name": "A String", # Name of the information type. Either a name of your choosing when 4526 # creating a CustomInfoType, or one of the names listed 4527 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4528 # a built-in type. InfoType names should conform to the pattern 4529 # [a-zA-Z0-9_]{1,64}. 4530 }, 4531 "maxFindings": 42, # Max findings limit for the given infoType. 4532 }, 4533 ], 4534 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 4535 # When set within `InspectDataSourceRequest`, 4536 # the maximum returned is 2000 regardless if this is set higher. 4537 # When set within `InspectContentRequest`, this field is ignored. 4538 }, 4539 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 4540 # POSSIBLE. 4541 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 4542 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 4543 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 4544 { # Custom information type provided by the user. Used to find domain-specific 4545 # sensitive information configurable to the data in question. 4546 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 4547 "pattern": "A String", # Pattern defining the regular expression. Its syntax 4548 # (https://github.com/google/re2/wiki/Syntax) can be found under the 4549 # google/re2 repository on GitHub. 4550 "groupIndexes": [ # The index of the submatch to extract as findings. When not 4551 # specified, the entire match is returned. No more than 3 may be included. 4552 42, 4553 ], 4554 }, 4555 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 4556 # support reversing. 4557 # such as 4558 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 4559 # These types of transformations are 4560 # those that perform pseudonymization, thereby producing a "surrogate" as 4561 # output. This should be used in conjunction with a field on the 4562 # transformation such as `surrogate_info_type`. This CustomInfoType does 4563 # not support the use of `detection_rules`. 4564 }, 4565 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 4566 # infoType, when the name matches one of existing infoTypes and that infoType 4567 # is specified in `InspectContent.info_types` field. Specifying the latter 4568 # adds findings to the one detected by the system. If built-in info type is 4569 # not specified in `InspectContent.info_types` list then the name is treated 4570 # as a custom info type. 4571 "name": "A String", # Name of the information type. Either a name of your choosing when 4572 # creating a CustomInfoType, or one of the names listed 4573 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4574 # a built-in type. InfoType names should conform to the pattern 4575 # [a-zA-Z0-9_]{1,64}. 4576 }, 4577 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 4578 # be used to match sensitive information specific to the data, such as a list 4579 # of employee IDs or job titles. 4580 # 4581 # Dictionary words are case-insensitive and all characters other than letters 4582 # and digits in the unicode [Basic Multilingual 4583 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 4584 # will be replaced with whitespace when scanning for matches, so the 4585 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 4586 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 4587 # surrounding any match must be of a different type than the adjacent 4588 # characters within the word, so letters must be next to non-letters and 4589 # digits next to non-digits. For example, the dictionary word "jen" will 4590 # match the first three letters of the text "jen123" but will return no 4591 # matches for "jennifer". 4592 # 4593 # Dictionary words containing a large number of characters that are not 4594 # letters or digits may result in unexpected findings because such characters 4595 # are treated as whitespace. The 4596 # [limits](https://cloud.google.com/dlp/limits) page contains details about 4597 # the size limits of dictionaries. For dictionaries that do not fit within 4598 # these constraints, consider using `LargeCustomDictionaryConfig` in the 4599 # `StoredInfoType` API. 4600 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 4601 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 4602 # at least one phrase and every phrase must contain at least 2 characters 4603 # that are letters or digits. [required] 4604 "A String", 4605 ], 4606 }, 4607 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 4608 # is accepted. 4609 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 4610 # Example: gs://[BUCKET_NAME]/dictionary.txt 4611 }, 4612 }, 4613 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 4614 # `InspectDataSource`. Not currently supported in `InspectContent`. 4615 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 4616 # `organizations/433245324/storedInfoTypes/432452342` or 4617 # `projects/project-id/storedInfoTypes/432452342`. 4618 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 4619 # inspection was created. Output-only field, populated by the system. 4620 }, 4621 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 4622 # Rules are applied in order that they are specified. Not supported for the 4623 # `surrogate_type` CustomInfoType. 4624 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 4625 # `CustomInfoType` to alter behavior under certain circumstances, depending 4626 # on the specific details of the rule. Not supported for the `surrogate_type` 4627 # custom infoType. 4628 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 4629 # proximity of hotwords. 4630 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 4631 # The total length of the window cannot exceed 1000 characters. Note that 4632 # the finding itself will be included in the window, so that hotwords may 4633 # be used to match substrings of the finding itself. For example, the 4634 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 4635 # adjusted upwards if the area code is known to be the local area code of 4636 # a company office using the hotword regex "\(xxx\)", where "xxx" 4637 # is the area code in question. 4638 # rule. 4639 "windowAfter": 42, # Number of characters after the finding to consider. 4640 "windowBefore": 42, # Number of characters before the finding to consider. 4641 }, 4642 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 4643 "pattern": "A String", # Pattern defining the regular expression. Its syntax 4644 # (https://github.com/google/re2/wiki/Syntax) can be found under the 4645 # google/re2 repository on GitHub. 4646 "groupIndexes": [ # The index of the submatch to extract as findings. When not 4647 # specified, the entire match is returned. No more than 3 may be included. 4648 42, 4649 ], 4650 }, 4651 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 4652 # part of a detection rule. 4653 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 4654 # levels. For example, if a finding would be `POSSIBLE` without the 4655 # detection rule and `relative_likelihood` is 1, then it is upgraded to 4656 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 4657 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 4658 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 4659 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 4660 # a final likelihood of `LIKELY`. 4661 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 4662 }, 4663 }, 4664 }, 4665 ], 4666 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 4667 # to be returned. It still can be used for rules matching. 4668 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 4669 # altered by a detection rule if the finding meets the criteria specified by 4670 # the rule. Defaults to `VERY_LIKELY` if not specified. 4671 }, 4672 ], 4673 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 4674 # included in the response; see Finding.quote. 4675 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 4676 # Exclusion rules, contained in the set are executed in the end, other 4677 # rules are executed in the order they are specified for each info type. 4678 { # Rule set for modifying a set of infoTypes to alter behavior under certain 4679 # circumstances, depending on the specific details of the rules within the set. 4680 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 4681 { # A single inspection rule to be applied to infoTypes, specified in 4682 # `InspectionRuleSet`. 4683 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 4684 # proximity of hotwords. 4685 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 4686 # The total length of the window cannot exceed 1000 characters. Note that 4687 # the finding itself will be included in the window, so that hotwords may 4688 # be used to match substrings of the finding itself. For example, the 4689 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 4690 # adjusted upwards if the area code is known to be the local area code of 4691 # a company office using the hotword regex "\(xxx\)", where "xxx" 4692 # is the area code in question. 4693 # rule. 4694 "windowAfter": 42, # Number of characters after the finding to consider. 4695 "windowBefore": 42, # Number of characters before the finding to consider. 4696 }, 4697 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 4698 "pattern": "A String", # Pattern defining the regular expression. Its syntax 4699 # (https://github.com/google/re2/wiki/Syntax) can be found under the 4700 # google/re2 repository on GitHub. 4701 "groupIndexes": [ # The index of the submatch to extract as findings. When not 4702 # specified, the entire match is returned. No more than 3 may be included. 4703 42, 4704 ], 4705 }, 4706 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 4707 # part of a detection rule. 4708 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 4709 # levels. For example, if a finding would be `POSSIBLE` without the 4710 # detection rule and `relative_likelihood` is 1, then it is upgraded to 4711 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 4712 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 4713 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 4714 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 4715 # a final likelihood of `LIKELY`. 4716 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 4717 }, 4718 }, 4719 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 4720 # `InspectionRuleSet` are removed from results. 4721 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 4722 "pattern": "A String", # Pattern defining the regular expression. Its syntax 4723 # (https://github.com/google/re2/wiki/Syntax) can be found under the 4724 # google/re2 repository on GitHub. 4725 "groupIndexes": [ # The index of the submatch to extract as findings. When not 4726 # specified, the entire match is returned. No more than 3 may be included. 4727 42, 4728 ], 4729 }, 4730 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 4731 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 4732 # contained within with a finding of an infoType from this list. For 4733 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 4734 # `exclusion_rule` containing `exclude_info_types.info_types` with 4735 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 4736 # with EMAIL_ADDRESS finding. 4737 # That leads to "555-222-2222@example.org" to generate only a single 4738 # finding, namely email address. 4739 { # Type of information detected by the API. 4740 "name": "A String", # Name of the information type. Either a name of your choosing when 4741 # creating a CustomInfoType, or one of the names listed 4742 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4743 # a built-in type. InfoType names should conform to the pattern 4744 # [a-zA-Z0-9_]{1,64}. 4745 }, 4746 ], 4747 }, 4748 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 4749 # be used to match sensitive information specific to the data, such as a list 4750 # of employee IDs or job titles. 4751 # 4752 # Dictionary words are case-insensitive and all characters other than letters 4753 # and digits in the unicode [Basic Multilingual 4754 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 4755 # will be replaced with whitespace when scanning for matches, so the 4756 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 4757 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 4758 # surrounding any match must be of a different type than the adjacent 4759 # characters within the word, so letters must be next to non-letters and 4760 # digits next to non-digits. For example, the dictionary word "jen" will 4761 # match the first three letters of the text "jen123" but will return no 4762 # matches for "jennifer". 4763 # 4764 # Dictionary words containing a large number of characters that are not 4765 # letters or digits may result in unexpected findings because such characters 4766 # are treated as whitespace. The 4767 # [limits](https://cloud.google.com/dlp/limits) page contains details about 4768 # the size limits of dictionaries. For dictionaries that do not fit within 4769 # these constraints, consider using `LargeCustomDictionaryConfig` in the 4770 # `StoredInfoType` API. 4771 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 4772 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 4773 # at least one phrase and every phrase must contain at least 2 characters 4774 # that are letters or digits. [required] 4775 "A String", 4776 ], 4777 }, 4778 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 4779 # is accepted. 4780 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 4781 # Example: gs://[BUCKET_NAME]/dictionary.txt 4782 }, 4783 }, 4784 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 4785 }, 4786 }, 4787 ], 4788 "infoTypes": [ # List of infoTypes this rule set is applied to. 4789 { # Type of information detected by the API. 4790 "name": "A String", # Name of the information type. Either a name of your choosing when 4791 # creating a CustomInfoType, or one of the names listed 4792 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4793 # a built-in type. InfoType names should conform to the pattern 4794 # [a-zA-Z0-9_]{1,64}. 4795 }, 4796 ], 4797 }, 4798 ], 4799 "contentOptions": [ # List of options defining data content to scan. 4800 # If empty, text, images, and other content will be included. 4801 "A String", 4802 ], 4803 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 4804 # InfoType values returned by ListInfoTypes or listed at 4805 # https://cloud.google.com/dlp/docs/infotypes-reference. 4806 # 4807 # When no InfoTypes or CustomInfoTypes are specified in a request, the 4808 # system may automatically choose what detectors to run. By default this may 4809 # be all types, but may change over time as detectors are updated. 4810 # 4811 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 4812 # but may change over time as new InfoTypes are added. If you need precise 4813 # control and predictability as to what detectors are run you should specify 4814 # specific InfoTypes listed in the reference. 4815 { # Type of information detected by the API. 4816 "name": "A String", # Name of the information type. Either a name of your choosing when 4817 # creating a CustomInfoType, or one of the names listed 4818 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 4819 # a built-in type. InfoType names should conform to the pattern 4820 # [a-zA-Z0-9_]{1,64}. 4821 }, 4822 ], 4823 }, 4824 "createTime": "A String", # The creation timestamp of a inspectTemplate, output only field. 4825 "name": "A String", # The template name. Output only. 4826 # 4827 # The template will have one of the following formats: 4828 # `projects/PROJECT_ID/inspectTemplates/TEMPLATE_ID` OR 4829 # `organizations/ORGANIZATION_ID/inspectTemplates/TEMPLATE_ID` 4830 }, 4831 "jobConfig": { 4832 "storageConfig": { # Shared message indicating Cloud storage type. # The data to scan. 4833 "datastoreOptions": { # Options defining a data set within Google Cloud Datastore. # Google Cloud Datastore options specification. 4834 "partitionId": { # Datastore partition ID. # A partition ID identifies a grouping of entities. The grouping is always 4835 # by project and namespace, however the namespace ID may be empty. 4836 # A partition ID identifies a grouping of entities. The grouping is always 4837 # by project and namespace, however the namespace ID may be empty. 4838 # 4839 # A partition ID contains several dimensions: 4840 # project ID and namespace ID. 4841 "projectId": "A String", # The ID of the project to which the entities belong. 4842 "namespaceId": "A String", # If not empty, the ID of the namespace to which the entities belong. 4843 }, 4844 "kind": { # A representation of a Datastore kind. # The kind to process. 4845 "name": "A String", # The name of the kind. 4846 }, 4847 }, 4848 "bigQueryOptions": { # Options defining BigQuery table and row identifiers. # BigQuery options specification. 4849 "excludedFields": [ # References to fields excluded from scanning. This allows you to skip 4850 # inspection of entire columns which you know have no findings. 4851 { # General identifier of a data field in a storage service. 4852 "name": "A String", # Name describing the field. 4853 }, 4854 ], 4855 "rowsLimit": "A String", # Max number of rows to scan. If the table has more rows than this value, the 4856 # rest of the rows are omitted. If not set, or if set to 0, all rows will be 4857 # scanned. Only one of rows_limit and rows_limit_percent can be specified. 4858 # Cannot be used in conjunction with TimespanConfig. 4859 "sampleMethod": "A String", 4860 "identifyingFields": [ # References to fields uniquely identifying rows within the table. 4861 # Nested fields in the format, like `person.birthdate.year`, are allowed. 4862 { # General identifier of a data field in a storage service. 4863 "name": "A String", # Name describing the field. 4864 }, 4865 ], 4866 "rowsLimitPercent": 42, # Max percentage of rows to scan. The rest are omitted. The number of rows 4867 # scanned is rounded down. Must be between 0 and 100, inclusively. Both 0 and 4868 # 100 means no limit. Defaults to 0. Only one of rows_limit and 4869 # rows_limit_percent can be specified. Cannot be used in conjunction with 4870 # TimespanConfig. 4871 "tableReference": { # Message defining the location of a BigQuery table. A table is uniquely # Complete BigQuery table reference. 4872 # identified by its project_id, dataset_id, and table_name. Within a query 4873 # a table is often referenced with a string in the format of: 4874 # `<project_id>:<dataset_id>.<table_id>` or 4875 # `<project_id>.<dataset_id>.<table_id>`. 4876 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 4877 # If omitted, project ID is inferred from the API call. 4878 "tableId": "A String", # Name of the table. 4879 "datasetId": "A String", # Dataset ID of the table. 4880 }, 4881 }, 4882 "timespanConfig": { # Configuration of the timespan of the items to include in scanning. 4883 # Currently only supported when inspecting Google Cloud Storage and BigQuery. 4884 "timestampField": { # General identifier of a data field in a storage service. # Specification of the field containing the timestamp of scanned items. 4885 # Used for data sources like Datastore or BigQuery. 4886 # If not specified for BigQuery, table last modification timestamp 4887 # is checked against given time span. 4888 # The valid data types of the timestamp field are: 4889 # for BigQuery - timestamp, date, datetime; 4890 # for Datastore - timestamp. 4891 # Datastore entity will be scanned if the timestamp property does not exist 4892 # or its value is empty or invalid. 4893 "name": "A String", # Name describing the field. 4894 }, 4895 "endTime": "A String", # Exclude files or rows newer than this value. 4896 # If set to zero, no upper time limit is applied. 4897 "startTime": "A String", # Exclude files or rows older than this value. 4898 "enableAutoPopulationOfTimespanConfig": True or False, # When the job is started by a JobTrigger we will automatically figure out 4899 # a valid start_time to avoid scanning files that have not been modified 4900 # since the last time the JobTrigger executed. This will be based on the 4901 # time of the execution of the last run of the JobTrigger. 4902 }, 4903 "cloudStorageOptions": { # Options defining a file or a set of files within a Google Cloud Storage # Google Cloud Storage options specification. 4904 # bucket. 4905 "bytesLimitPerFile": "A String", # Max number of bytes to scan from a file. If a scanned file's size is bigger 4906 # than this value then the rest of the bytes are omitted. Only one 4907 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 4908 "sampleMethod": "A String", 4909 "fileSet": { # Set of files to scan. # The set of one or more files to scan. 4910 "url": "A String", # The Cloud Storage url of the file(s) to scan, in the format 4911 # `gs://<bucket>/<path>`. Trailing wildcard in the path is allowed. 4912 # 4913 # If the url ends in a trailing slash, the bucket or directory represented 4914 # by the url will be scanned non-recursively (content in sub-directories 4915 # will not be scanned). This means that `gs://mybucket/` is equivalent to 4916 # `gs://mybucket/*`, and `gs://mybucket/directory/` is equivalent to 4917 # `gs://mybucket/directory/*`. 4918 # 4919 # Exactly one of `url` or `regex_file_set` must be set. 4920 "regexFileSet": { # Message representing a set of files in a Cloud Storage bucket. Regular # The regex-filtered set of files to scan. Exactly one of `url` or 4921 # `regex_file_set` must be set. 4922 # expressions are used to allow fine-grained control over which files in the 4923 # bucket to include. 4924 # 4925 # Included files are those that match at least one item in `include_regex` and 4926 # do not match any items in `exclude_regex`. Note that a file that matches 4927 # items from both lists will _not_ be included. For a match to occur, the 4928 # entire file path (i.e., everything in the url after the bucket name) must 4929 # match the regular expression. 4930 # 4931 # For example, given the input `{bucket_name: "mybucket", include_regex: 4932 # ["directory1/.*"], exclude_regex: 4933 # ["directory1/excluded.*"]}`: 4934 # 4935 # * `gs://mybucket/directory1/myfile` will be included 4936 # * `gs://mybucket/directory1/directory2/myfile` will be included (`.*` matches 4937 # across `/`) 4938 # * `gs://mybucket/directory0/directory1/myfile` will _not_ be included (the 4939 # full path doesn't match any items in `include_regex`) 4940 # * `gs://mybucket/directory1/excludedfile` will _not_ be included (the path 4941 # matches an item in `exclude_regex`) 4942 # 4943 # If `include_regex` is left empty, it will match all files by default 4944 # (this is equivalent to setting `include_regex: [".*"]`). 4945 # 4946 # Some other common use cases: 4947 # 4948 # * `{bucket_name: "mybucket", exclude_regex: [".*\.pdf"]}` will include all 4949 # files in `mybucket` except for .pdf files 4950 # * `{bucket_name: "mybucket", include_regex: ["directory/[^/]+"]}` will 4951 # include all files directly under `gs://mybucket/directory/`, without matching 4952 # across `/` 4953 "excludeRegex": [ # A list of regular expressions matching file paths to exclude. All files in 4954 # the bucket that match at least one of these regular expressions will be 4955 # excluded from the scan. 4956 # 4957 # Regular expressions use RE2 4958 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 4959 # under the google/re2 repository on GitHub. 4960 "A String", 4961 ], 4962 "bucketName": "A String", # The name of a Cloud Storage bucket. Required. 4963 "includeRegex": [ # A list of regular expressions matching file paths to include. All files in 4964 # the bucket that match at least one of these regular expressions will be 4965 # included in the set of files, except for those that also match an item in 4966 # `exclude_regex`. Leaving this field empty will match all files by default 4967 # (this is equivalent to including `.*` in the list). 4968 # 4969 # Regular expressions use RE2 4970 # [syntax](https://github.com/google/re2/wiki/Syntax); a guide can be found 4971 # under the google/re2 repository on GitHub. 4972 "A String", 4973 ], 4974 }, 4975 }, 4976 "bytesLimitPerFilePercent": 42, # Max percentage of bytes to scan from a file. The rest are omitted. The 4977 # number of bytes scanned is rounded down. Must be between 0 and 100, 4978 # inclusively. Both 0 and 100 means no limit. Defaults to 0. Only one 4979 # of bytes_limit_per_file and bytes_limit_per_file_percent can be specified. 4980 "filesLimitPercent": 42, # Limits the number of files to scan to this percentage of the input FileSet. 4981 # Number of files scanned is rounded down. Must be between 0 and 100, 4982 # inclusively. Both 0 and 100 means no limit. Defaults to 0. 4983 "fileTypes": [ # List of file type groups to include in the scan. 4984 # If empty, all files are scanned and available data format processors 4985 # are applied. In addition, the binary content of the selected files 4986 # is always scanned as well. 4987 "A String", 4988 ], 4989 }, 4990 }, 4991 "inspectConfig": { # Configuration description of the scanning process. # How and what to scan for. 4992 # When used with redactContent only info_types and min_likelihood are currently 4993 # used. 4994 "excludeInfoTypes": True or False, # When true, excludes type information of the findings. 4995 "limits": { 4996 "maxFindingsPerRequest": 42, # Max number of findings that will be returned per request/job. 4997 # When set within `InspectContentRequest`, the maximum returned is 2000 4998 # regardless if this is set higher. 4999 "maxFindingsPerInfoType": [ # Configuration of findings limit given for specified infoTypes. 5000 { # Max findings configuration per infoType, per content item or long 5001 # running DlpJob. 5002 "infoType": { # Type of information detected by the API. # Type of information the findings limit applies to. Only one limit per 5003 # info_type should be provided. If InfoTypeLimit does not have an 5004 # info_type, the DLP API applies the limit against all info_types that 5005 # are found but not specified in another InfoTypeLimit. 5006 "name": "A String", # Name of the information type. Either a name of your choosing when 5007 # creating a CustomInfoType, or one of the names listed 5008 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5009 # a built-in type. InfoType names should conform to the pattern 5010 # [a-zA-Z0-9_]{1,64}. 5011 }, 5012 "maxFindings": 42, # Max findings limit for the given infoType. 5013 }, 5014 ], 5015 "maxFindingsPerItem": 42, # Max number of findings that will be returned for each item scanned. 5016 # When set within `InspectDataSourceRequest`, 5017 # the maximum returned is 2000 regardless if this is set higher. 5018 # When set within `InspectContentRequest`, this field is ignored. 5019 }, 5020 "minLikelihood": "A String", # Only returns findings equal or above this threshold. The default is 5021 # POSSIBLE. 5022 # See https://cloud.google.com/dlp/docs/likelihood to learn more. 5023 "customInfoTypes": [ # CustomInfoTypes provided by the user. See 5024 # https://cloud.google.com/dlp/docs/creating-custom-infotypes to learn more. 5025 { # Custom information type provided by the user. Used to find domain-specific 5026 # sensitive information configurable to the data in question. 5027 "regex": { # Message defining a custom regular expression. # Regular expression based CustomInfoType. 5028 "pattern": "A String", # Pattern defining the regular expression. Its syntax 5029 # (https://github.com/google/re2/wiki/Syntax) can be found under the 5030 # google/re2 repository on GitHub. 5031 "groupIndexes": [ # The index of the submatch to extract as findings. When not 5032 # specified, the entire match is returned. No more than 3 may be included. 5033 42, 5034 ], 5035 }, 5036 "surrogateType": { # Message for detecting output from deidentification transformations # Message for detecting output from deidentification transformations that 5037 # support reversing. 5038 # such as 5039 # [`CryptoReplaceFfxFpeConfig`](/dlp/docs/reference/rest/v2/organizations.deidentifyTemplates#cryptoreplaceffxfpeconfig). 5040 # These types of transformations are 5041 # those that perform pseudonymization, thereby producing a "surrogate" as 5042 # output. This should be used in conjunction with a field on the 5043 # transformation such as `surrogate_info_type`. This CustomInfoType does 5044 # not support the use of `detection_rules`. 5045 }, 5046 "infoType": { # Type of information detected by the API. # CustomInfoType can either be a new infoType, or an extension of built-in 5047 # infoType, when the name matches one of existing infoTypes and that infoType 5048 # is specified in `InspectContent.info_types` field. Specifying the latter 5049 # adds findings to the one detected by the system. If built-in info type is 5050 # not specified in `InspectContent.info_types` list then the name is treated 5051 # as a custom info type. 5052 "name": "A String", # Name of the information type. Either a name of your choosing when 5053 # creating a CustomInfoType, or one of the names listed 5054 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5055 # a built-in type. InfoType names should conform to the pattern 5056 # [a-zA-Z0-9_]{1,64}. 5057 }, 5058 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # A list of phrases to detect as a CustomInfoType. 5059 # be used to match sensitive information specific to the data, such as a list 5060 # of employee IDs or job titles. 5061 # 5062 # Dictionary words are case-insensitive and all characters other than letters 5063 # and digits in the unicode [Basic Multilingual 5064 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 5065 # will be replaced with whitespace when scanning for matches, so the 5066 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 5067 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 5068 # surrounding any match must be of a different type than the adjacent 5069 # characters within the word, so letters must be next to non-letters and 5070 # digits next to non-digits. For example, the dictionary word "jen" will 5071 # match the first three letters of the text "jen123" but will return no 5072 # matches for "jennifer". 5073 # 5074 # Dictionary words containing a large number of characters that are not 5075 # letters or digits may result in unexpected findings because such characters 5076 # are treated as whitespace. The 5077 # [limits](https://cloud.google.com/dlp/limits) page contains details about 5078 # the size limits of dictionaries. For dictionaries that do not fit within 5079 # these constraints, consider using `LargeCustomDictionaryConfig` in the 5080 # `StoredInfoType` API. 5081 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 5082 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 5083 # at least one phrase and every phrase must contain at least 2 characters 5084 # that are letters or digits. [required] 5085 "A String", 5086 ], 5087 }, 5088 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 5089 # is accepted. 5090 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 5091 # Example: gs://[BUCKET_NAME]/dictionary.txt 5092 }, 5093 }, 5094 "storedType": { # A reference to a StoredInfoType to use with scanning. # Load an existing `StoredInfoType` resource for use in 5095 # `InspectDataSource`. Not currently supported in `InspectContent`. 5096 "name": "A String", # Resource name of the requested `StoredInfoType`, for example 5097 # `organizations/433245324/storedInfoTypes/432452342` or 5098 # `projects/project-id/storedInfoTypes/432452342`. 5099 "createTime": "A String", # Timestamp indicating when the version of the `StoredInfoType` used for 5100 # inspection was created. Output-only field, populated by the system. 5101 }, 5102 "detectionRules": [ # Set of detection rules to apply to all findings of this CustomInfoType. 5103 # Rules are applied in order that they are specified. Not supported for the 5104 # `surrogate_type` CustomInfoType. 5105 { # Deprecated; use `InspectionRuleSet` instead. Rule for modifying a 5106 # `CustomInfoType` to alter behavior under certain circumstances, depending 5107 # on the specific details of the rule. Not supported for the `surrogate_type` 5108 # custom infoType. 5109 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 5110 # proximity of hotwords. 5111 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 5112 # The total length of the window cannot exceed 1000 characters. Note that 5113 # the finding itself will be included in the window, so that hotwords may 5114 # be used to match substrings of the finding itself. For example, the 5115 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 5116 # adjusted upwards if the area code is known to be the local area code of 5117 # a company office using the hotword regex "\(xxx\)", where "xxx" 5118 # is the area code in question. 5119 # rule. 5120 "windowAfter": 42, # Number of characters after the finding to consider. 5121 "windowBefore": 42, # Number of characters before the finding to consider. 5122 }, 5123 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 5124 "pattern": "A String", # Pattern defining the regular expression. Its syntax 5125 # (https://github.com/google/re2/wiki/Syntax) can be found under the 5126 # google/re2 repository on GitHub. 5127 "groupIndexes": [ # The index of the submatch to extract as findings. When not 5128 # specified, the entire match is returned. No more than 3 may be included. 5129 42, 5130 ], 5131 }, 5132 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 5133 # part of a detection rule. 5134 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 5135 # levels. For example, if a finding would be `POSSIBLE` without the 5136 # detection rule and `relative_likelihood` is 1, then it is upgraded to 5137 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 5138 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 5139 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 5140 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 5141 # a final likelihood of `LIKELY`. 5142 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 5143 }, 5144 }, 5145 }, 5146 ], 5147 "exclusionType": "A String", # If set to EXCLUSION_TYPE_EXCLUDE this infoType will not cause a finding 5148 # to be returned. It still can be used for rules matching. 5149 "likelihood": "A String", # Likelihood to return for this CustomInfoType. This base value can be 5150 # altered by a detection rule if the finding meets the criteria specified by 5151 # the rule. Defaults to `VERY_LIKELY` if not specified. 5152 }, 5153 ], 5154 "includeQuote": True or False, # When true, a contextual quote from the data that triggered a finding is 5155 # included in the response; see Finding.quote. 5156 "ruleSet": [ # Set of rules to apply to the findings for this InspectConfig. 5157 # Exclusion rules, contained in the set are executed in the end, other 5158 # rules are executed in the order they are specified for each info type. 5159 { # Rule set for modifying a set of infoTypes to alter behavior under certain 5160 # circumstances, depending on the specific details of the rules within the set. 5161 "rules": [ # Set of rules to be applied to infoTypes. The rules are applied in order. 5162 { # A single inspection rule to be applied to infoTypes, specified in 5163 # `InspectionRuleSet`. 5164 "hotwordRule": { # The rule that adjusts the likelihood of findings within a certain # Hotword-based detection rule. 5165 # proximity of hotwords. 5166 "proximity": { # Message for specifying a window around a finding to apply a detection # Proximity of the finding within which the entire hotword must reside. 5167 # The total length of the window cannot exceed 1000 characters. Note that 5168 # the finding itself will be included in the window, so that hotwords may 5169 # be used to match substrings of the finding itself. For example, the 5170 # certainty of a phone number regex "\(\d{3}\) \d{3}-\d{4}" could be 5171 # adjusted upwards if the area code is known to be the local area code of 5172 # a company office using the hotword regex "\(xxx\)", where "xxx" 5173 # is the area code in question. 5174 # rule. 5175 "windowAfter": 42, # Number of characters after the finding to consider. 5176 "windowBefore": 42, # Number of characters before the finding to consider. 5177 }, 5178 "hotwordRegex": { # Message defining a custom regular expression. # Regular expression pattern defining what qualifies as a hotword. 5179 "pattern": "A String", # Pattern defining the regular expression. Its syntax 5180 # (https://github.com/google/re2/wiki/Syntax) can be found under the 5181 # google/re2 repository on GitHub. 5182 "groupIndexes": [ # The index of the submatch to extract as findings. When not 5183 # specified, the entire match is returned. No more than 3 may be included. 5184 42, 5185 ], 5186 }, 5187 "likelihoodAdjustment": { # Message for specifying an adjustment to the likelihood of a finding as # Likelihood adjustment to apply to all matching findings. 5188 # part of a detection rule. 5189 "relativeLikelihood": 42, # Increase or decrease the likelihood by the specified number of 5190 # levels. For example, if a finding would be `POSSIBLE` without the 5191 # detection rule and `relative_likelihood` is 1, then it is upgraded to 5192 # `LIKELY`, while a value of -1 would downgrade it to `UNLIKELY`. 5193 # Likelihood may never drop below `VERY_UNLIKELY` or exceed 5194 # `VERY_LIKELY`, so applying an adjustment of 1 followed by an 5195 # adjustment of -1 when base likelihood is `VERY_LIKELY` will result in 5196 # a final likelihood of `LIKELY`. 5197 "fixedLikelihood": "A String", # Set the likelihood of a finding to a fixed value. 5198 }, 5199 }, 5200 "exclusionRule": { # The rule that specifies conditions when findings of infoTypes specified in # Exclusion rule. 5201 # `InspectionRuleSet` are removed from results. 5202 "regex": { # Message defining a custom regular expression. # Regular expression which defines the rule. 5203 "pattern": "A String", # Pattern defining the regular expression. Its syntax 5204 # (https://github.com/google/re2/wiki/Syntax) can be found under the 5205 # google/re2 repository on GitHub. 5206 "groupIndexes": [ # The index of the submatch to extract as findings. When not 5207 # specified, the entire match is returned. No more than 3 may be included. 5208 42, 5209 ], 5210 }, 5211 "excludeInfoTypes": { # List of exclude infoTypes. # Set of infoTypes for which findings would affect this rule. 5212 "infoTypes": [ # InfoType list in ExclusionRule rule drops a finding when it overlaps or 5213 # contained within with a finding of an infoType from this list. For 5214 # example, for `InspectionRuleSet.info_types` containing "PHONE_NUMBER"` and 5215 # `exclusion_rule` containing `exclude_info_types.info_types` with 5216 # "EMAIL_ADDRESS" the phone number findings are dropped if they overlap 5217 # with EMAIL_ADDRESS finding. 5218 # That leads to "555-222-2222@example.org" to generate only a single 5219 # finding, namely email address. 5220 { # Type of information detected by the API. 5221 "name": "A String", # Name of the information type. Either a name of your choosing when 5222 # creating a CustomInfoType, or one of the names listed 5223 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5224 # a built-in type. InfoType names should conform to the pattern 5225 # [a-zA-Z0-9_]{1,64}. 5226 }, 5227 ], 5228 }, 5229 "dictionary": { # Custom information type based on a dictionary of words or phrases. This can # Dictionary which defines the rule. 5230 # be used to match sensitive information specific to the data, such as a list 5231 # of employee IDs or job titles. 5232 # 5233 # Dictionary words are case-insensitive and all characters other than letters 5234 # and digits in the unicode [Basic Multilingual 5235 # Plane](https://en.wikipedia.org/wiki/Plane_%28Unicode%29#Basic_Multilingual_Plane) 5236 # will be replaced with whitespace when scanning for matches, so the 5237 # dictionary phrase "Sam Johnson" will match all three phrases "sam johnson", 5238 # "Sam, Johnson", and "Sam (Johnson)". Additionally, the characters 5239 # surrounding any match must be of a different type than the adjacent 5240 # characters within the word, so letters must be next to non-letters and 5241 # digits next to non-digits. For example, the dictionary word "jen" will 5242 # match the first three letters of the text "jen123" but will return no 5243 # matches for "jennifer". 5244 # 5245 # Dictionary words containing a large number of characters that are not 5246 # letters or digits may result in unexpected findings because such characters 5247 # are treated as whitespace. The 5248 # [limits](https://cloud.google.com/dlp/limits) page contains details about 5249 # the size limits of dictionaries. For dictionaries that do not fit within 5250 # these constraints, consider using `LargeCustomDictionaryConfig` in the 5251 # `StoredInfoType` API. 5252 "wordList": { # Message defining a list of words or phrases to search for in the data. # List of words or phrases to search for. 5253 "words": [ # Words or phrases defining the dictionary. The dictionary must contain 5254 # at least one phrase and every phrase must contain at least 2 characters 5255 # that are letters or digits. [required] 5256 "A String", 5257 ], 5258 }, 5259 "cloudStoragePath": { # Message representing a single file or path in Cloud Storage. # Newline-delimited file of words in Cloud Storage. Only a single file 5260 # is accepted. 5261 "path": "A String", # A url representing a file or path (no wildcards) in Cloud Storage. 5262 # Example: gs://[BUCKET_NAME]/dictionary.txt 5263 }, 5264 }, 5265 "matchingType": "A String", # How the rule is applied, see MatchingType documentation for details. 5266 }, 5267 }, 5268 ], 5269 "infoTypes": [ # List of infoTypes this rule set is applied to. 5270 { # Type of information detected by the API. 5271 "name": "A String", # Name of the information type. Either a name of your choosing when 5272 # creating a CustomInfoType, or one of the names listed 5273 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5274 # a built-in type. InfoType names should conform to the pattern 5275 # [a-zA-Z0-9_]{1,64}. 5276 }, 5277 ], 5278 }, 5279 ], 5280 "contentOptions": [ # List of options defining data content to scan. 5281 # If empty, text, images, and other content will be included. 5282 "A String", 5283 ], 5284 "infoTypes": [ # Restricts what info_types to look for. The values must correspond to 5285 # InfoType values returned by ListInfoTypes or listed at 5286 # https://cloud.google.com/dlp/docs/infotypes-reference. 5287 # 5288 # When no InfoTypes or CustomInfoTypes are specified in a request, the 5289 # system may automatically choose what detectors to run. By default this may 5290 # be all types, but may change over time as detectors are updated. 5291 # 5292 # The special InfoType name "ALL_BASIC" can be used to trigger all detectors, 5293 # but may change over time as new InfoTypes are added. If you need precise 5294 # control and predictability as to what detectors are run you should specify 5295 # specific InfoTypes listed in the reference. 5296 { # Type of information detected by the API. 5297 "name": "A String", # Name of the information type. Either a name of your choosing when 5298 # creating a CustomInfoType, or one of the names listed 5299 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5300 # a built-in type. InfoType names should conform to the pattern 5301 # [a-zA-Z0-9_]{1,64}. 5302 }, 5303 ], 5304 }, 5305 "inspectTemplateName": "A String", # If provided, will be used as the default for all values in InspectConfig. 5306 # `inspect_config` will be merged into the values persisted as part of the 5307 # template. 5308 "actions": [ # Actions to execute at the completion of the job. 5309 { # A task to execute on the completion of a job. 5310 # See https://cloud.google.com/dlp/docs/concepts-actions to learn more. 5311 "saveFindings": { # If set, the detailed findings will be persisted to the specified # Save resulting findings in a provided location. 5312 # OutputStorageConfig. Only a single instance of this action can be 5313 # specified. 5314 # Compatible with: Inspect, Risk 5315 "outputConfig": { # Cloud repository for storing output. 5316 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Store findings in an existing table or a new table in an existing 5317 # dataset. If table_id is not set a new one will be generated 5318 # for you with the following format: 5319 # dlp_googleapis_yyyy_mm_dd_[dlp_job_id]. Pacific timezone will be used for 5320 # generating the date details. 5321 # 5322 # For Inspect, each column in an existing output table must have the same 5323 # name, type, and mode of a field in the `Finding` object. 5324 # 5325 # For Risk, an existing output table should be the output of a previous 5326 # Risk analysis job run on the same source table, with the same privacy 5327 # metric and quasi-identifiers. Risk jobs that analyze the same table but 5328 # compute a different privacy metric, or use different sets of 5329 # quasi-identifiers, cannot store their results in the same table. 5330 # identified by its project_id, dataset_id, and table_name. Within a query 5331 # a table is often referenced with a string in the format of: 5332 # `<project_id>:<dataset_id>.<table_id>` or 5333 # `<project_id>.<dataset_id>.<table_id>`. 5334 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 5335 # If omitted, project ID is inferred from the API call. 5336 "tableId": "A String", # Name of the table. 5337 "datasetId": "A String", # Dataset ID of the table. 5338 }, 5339 "outputSchema": "A String", # Schema used for writing the findings for Inspect jobs. This field is only 5340 # used for Inspect and must be unspecified for Risk jobs. Columns are derived 5341 # from the `Finding` object. If appending to an existing table, any columns 5342 # from the predefined schema that are missing will be added. No columns in 5343 # the existing table will be deleted. 5344 # 5345 # If unspecified, then all available columns will be used for a new table or 5346 # an (existing) table with no schema, and no changes will be made to an 5347 # existing table that has a schema. 5348 }, 5349 }, 5350 "jobNotificationEmails": { # Enable email notification to project owners and editors on jobs's # Enable email notification to project owners and editors on job's 5351 # completion/failure. 5352 # completion/failure. 5353 }, 5354 "publishSummaryToCscc": { # Publish the result summary of a DlpJob to the Cloud Security # Publish summary to Cloud Security Command Center (Alpha). 5355 # Command Center (CSCC Alpha). 5356 # This action is only available for projects which are parts of 5357 # an organization and whitelisted for the alpha Cloud Security Command 5358 # Center. 5359 # The action will publish count of finding instances and their info types. 5360 # The summary of findings will be persisted in CSCC and are governed by CSCC 5361 # service-specific policy, see https://cloud.google.com/terms/service-terms 5362 # Only a single instance of this action can be specified. 5363 # Compatible with: Inspect 5364 }, 5365 "pubSub": { # Publish a message into given Pub/Sub topic when DlpJob has completed. The # Publish a notification to a pubsub topic. 5366 # message contains a single field, `DlpJobName`, which is equal to the 5367 # finished job's 5368 # [`DlpJob.name`](/dlp/docs/reference/rest/v2/projects.dlpJobs#DlpJob). 5369 # Compatible with: Inspect, Risk 5370 "topic": "A String", # Cloud Pub/Sub topic to send notifications to. The topic must have given 5371 # publishing access rights to the DLP API service account executing 5372 # the long running DlpJob sending the notifications. 5373 # Format is projects/{project}/topics/{topic}. 5374 }, 5375 }, 5376 ], 5377 }, 5378 }, 5379 "result": { # All result fields mentioned below are updated while the job is processing. # A summary of the outcome of this inspect job. 5380 "infoTypeStats": [ # Statistics of how many instances of each info type were found during 5381 # inspect job. 5382 { # Statistics regarding a specific InfoType. 5383 "count": "A String", # Number of findings for this infoType. 5384 "infoType": { # Type of information detected by the API. # The type of finding this stat is for. 5385 "name": "A String", # Name of the information type. Either a name of your choosing when 5386 # creating a CustomInfoType, or one of the names listed 5387 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5388 # a built-in type. InfoType names should conform to the pattern 5389 # [a-zA-Z0-9_]{1,64}. 5390 }, 5391 }, 5392 ], 5393 "totalEstimatedBytes": "A String", # Estimate of the number of bytes to process. 5394 "processedBytes": "A String", # Total size in bytes that were processed. 5395 }, 5396 }, 5397 "riskDetails": { # Result of a risk analysis operation request. # Results from analyzing risk of a data source. 5398 "numericalStatsResult": { # Result of the numerical stats computation. 5399 "quantileValues": [ # List of 99 values that partition the set of field values into 100 equal 5400 # sized buckets. 5401 { # Set of primitive values supported by the system. 5402 # Note that for the purposes of inspection or transformation, the number 5403 # of bytes considered to comprise a 'Value' is based on its representation 5404 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5405 # 123456789, the number of bytes would be counted as 9, even though an 5406 # int64 only holds up to 8 bytes of data. 5407 "floatValue": 3.14, 5408 "timestampValue": "A String", 5409 "dayOfWeekValue": "A String", 5410 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5411 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5412 # types are google.type.Date and `google.protobuf.Timestamp`. 5413 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5414 # to allow the value "24:00:00" for scenarios like business closing time. 5415 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5416 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5417 # allow the value 60 if it allows leap-seconds. 5418 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5419 }, 5420 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5421 # and time zone are either specified elsewhere or are not significant. The date 5422 # is relative to the Proleptic Gregorian Calendar. This can represent: 5423 # 5424 # * A full date, with non-zero year, month and day values 5425 # * A month and day value, with a zero year, e.g. an anniversary 5426 # * A year on its own, with zero month and day values 5427 # * A year and month value, with a zero day, e.g. a credit card expiration date 5428 # 5429 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5430 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5431 # a year. 5432 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5433 # if specifying a year by itself or a year and month where the day is not 5434 # significant. 5435 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5436 # month and day. 5437 }, 5438 "stringValue": "A String", 5439 "booleanValue": True or False, 5440 "integerValue": "A String", 5441 }, 5442 ], 5443 "maxValue": { # Set of primitive values supported by the system. # Maximum value appearing in the column. 5444 # Note that for the purposes of inspection or transformation, the number 5445 # of bytes considered to comprise a 'Value' is based on its representation 5446 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5447 # 123456789, the number of bytes would be counted as 9, even though an 5448 # int64 only holds up to 8 bytes of data. 5449 "floatValue": 3.14, 5450 "timestampValue": "A String", 5451 "dayOfWeekValue": "A String", 5452 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5453 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5454 # types are google.type.Date and `google.protobuf.Timestamp`. 5455 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5456 # to allow the value "24:00:00" for scenarios like business closing time. 5457 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5458 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5459 # allow the value 60 if it allows leap-seconds. 5460 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5461 }, 5462 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5463 # and time zone are either specified elsewhere or are not significant. The date 5464 # is relative to the Proleptic Gregorian Calendar. This can represent: 5465 # 5466 # * A full date, with non-zero year, month and day values 5467 # * A month and day value, with a zero year, e.g. an anniversary 5468 # * A year on its own, with zero month and day values 5469 # * A year and month value, with a zero day, e.g. a credit card expiration date 5470 # 5471 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5472 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5473 # a year. 5474 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5475 # if specifying a year by itself or a year and month where the day is not 5476 # significant. 5477 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5478 # month and day. 5479 }, 5480 "stringValue": "A String", 5481 "booleanValue": True or False, 5482 "integerValue": "A String", 5483 }, 5484 "minValue": { # Set of primitive values supported by the system. # Minimum value appearing in the column. 5485 # Note that for the purposes of inspection or transformation, the number 5486 # of bytes considered to comprise a 'Value' is based on its representation 5487 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5488 # 123456789, the number of bytes would be counted as 9, even though an 5489 # int64 only holds up to 8 bytes of data. 5490 "floatValue": 3.14, 5491 "timestampValue": "A String", 5492 "dayOfWeekValue": "A String", 5493 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5494 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5495 # types are google.type.Date and `google.protobuf.Timestamp`. 5496 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5497 # to allow the value "24:00:00" for scenarios like business closing time. 5498 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5499 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5500 # allow the value 60 if it allows leap-seconds. 5501 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5502 }, 5503 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5504 # and time zone are either specified elsewhere or are not significant. The date 5505 # is relative to the Proleptic Gregorian Calendar. This can represent: 5506 # 5507 # * A full date, with non-zero year, month and day values 5508 # * A month and day value, with a zero year, e.g. an anniversary 5509 # * A year on its own, with zero month and day values 5510 # * A year and month value, with a zero day, e.g. a credit card expiration date 5511 # 5512 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5513 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5514 # a year. 5515 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5516 # if specifying a year by itself or a year and month where the day is not 5517 # significant. 5518 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5519 # month and day. 5520 }, 5521 "stringValue": "A String", 5522 "booleanValue": True or False, 5523 "integerValue": "A String", 5524 }, 5525 }, 5526 "kMapEstimationResult": { # Result of the reidentifiability analysis. Note that these results are an 5527 # estimation, not exact values. 5528 "kMapEstimationHistogram": [ # The intervals [min_anonymity, max_anonymity] do not overlap. If a value 5529 # doesn't correspond to any such interval, the associated frequency is 5530 # zero. For example, the following records: 5531 # {min_anonymity: 1, max_anonymity: 1, frequency: 17} 5532 # {min_anonymity: 2, max_anonymity: 3, frequency: 42} 5533 # {min_anonymity: 5, max_anonymity: 10, frequency: 99} 5534 # mean that there are no record with an estimated anonymity of 4, 5, or 5535 # larger than 10. 5536 { # A KMapEstimationHistogramBucket message with the following values: 5537 # min_anonymity: 3 5538 # max_anonymity: 5 5539 # frequency: 42 5540 # means that there are 42 records whose quasi-identifier values correspond 5541 # to 3, 4 or 5 people in the overlying population. An important particular 5542 # case is when min_anonymity = max_anonymity = 1: the frequency field then 5543 # corresponds to the number of uniquely identifiable records. 5544 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 5545 # number of classes returned per bucket is capped at 20. 5546 { # A tuple of values for the quasi-identifier columns. 5547 "estimatedAnonymity": "A String", # The estimated anonymity for these quasi-identifier values. 5548 "quasiIdsValues": [ # The quasi-identifier values. 5549 { # Set of primitive values supported by the system. 5550 # Note that for the purposes of inspection or transformation, the number 5551 # of bytes considered to comprise a 'Value' is based on its representation 5552 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5553 # 123456789, the number of bytes would be counted as 9, even though an 5554 # int64 only holds up to 8 bytes of data. 5555 "floatValue": 3.14, 5556 "timestampValue": "A String", 5557 "dayOfWeekValue": "A String", 5558 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5559 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5560 # types are google.type.Date and `google.protobuf.Timestamp`. 5561 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5562 # to allow the value "24:00:00" for scenarios like business closing time. 5563 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5564 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5565 # allow the value 60 if it allows leap-seconds. 5566 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5567 }, 5568 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5569 # and time zone are either specified elsewhere or are not significant. The date 5570 # is relative to the Proleptic Gregorian Calendar. This can represent: 5571 # 5572 # * A full date, with non-zero year, month and day values 5573 # * A month and day value, with a zero year, e.g. an anniversary 5574 # * A year on its own, with zero month and day values 5575 # * A year and month value, with a zero day, e.g. a credit card expiration date 5576 # 5577 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5578 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5579 # a year. 5580 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5581 # if specifying a year by itself or a year and month where the day is not 5582 # significant. 5583 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5584 # month and day. 5585 }, 5586 "stringValue": "A String", 5587 "booleanValue": True or False, 5588 "integerValue": "A String", 5589 }, 5590 ], 5591 }, 5592 ], 5593 "minAnonymity": "A String", # Always positive. 5594 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 5595 "maxAnonymity": "A String", # Always greater than or equal to min_anonymity. 5596 "bucketSize": "A String", # Number of records within these anonymity bounds. 5597 }, 5598 ], 5599 }, 5600 "kAnonymityResult": { # Result of the k-anonymity computation. 5601 "equivalenceClassHistogramBuckets": [ # Histogram of k-anonymity equivalence classes. 5602 { 5603 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 5604 # classes returned per bucket is capped at 20. 5605 { # The set of columns' values that share the same ldiversity value 5606 "quasiIdsValues": [ # Set of values defining the equivalence class. One value per 5607 # quasi-identifier column in the original KAnonymity metric message. 5608 # The order is always the same as the original request. 5609 { # Set of primitive values supported by the system. 5610 # Note that for the purposes of inspection or transformation, the number 5611 # of bytes considered to comprise a 'Value' is based on its representation 5612 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5613 # 123456789, the number of bytes would be counted as 9, even though an 5614 # int64 only holds up to 8 bytes of data. 5615 "floatValue": 3.14, 5616 "timestampValue": "A String", 5617 "dayOfWeekValue": "A String", 5618 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5619 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5620 # types are google.type.Date and `google.protobuf.Timestamp`. 5621 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5622 # to allow the value "24:00:00" for scenarios like business closing time. 5623 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5624 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5625 # allow the value 60 if it allows leap-seconds. 5626 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5627 }, 5628 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5629 # and time zone are either specified elsewhere or are not significant. The date 5630 # is relative to the Proleptic Gregorian Calendar. This can represent: 5631 # 5632 # * A full date, with non-zero year, month and day values 5633 # * A month and day value, with a zero year, e.g. an anniversary 5634 # * A year on its own, with zero month and day values 5635 # * A year and month value, with a zero day, e.g. a credit card expiration date 5636 # 5637 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5638 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5639 # a year. 5640 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5641 # if specifying a year by itself or a year and month where the day is not 5642 # significant. 5643 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5644 # month and day. 5645 }, 5646 "stringValue": "A String", 5647 "booleanValue": True or False, 5648 "integerValue": "A String", 5649 }, 5650 ], 5651 "equivalenceClassSize": "A String", # Size of the equivalence class, for example number of rows with the 5652 # above set of values. 5653 }, 5654 ], 5655 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 5656 "equivalenceClassSizeLowerBound": "A String", # Lower bound on the size of the equivalence classes in this bucket. 5657 "equivalenceClassSizeUpperBound": "A String", # Upper bound on the size of the equivalence classes in this bucket. 5658 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 5659 }, 5660 ], 5661 }, 5662 "lDiversityResult": { # Result of the l-diversity computation. 5663 "sensitiveValueFrequencyHistogramBuckets": [ # Histogram of l-diversity equivalence class sensitive value frequencies. 5664 { 5665 "bucketValues": [ # Sample of equivalence classes in this bucket. The total number of 5666 # classes returned per bucket is capped at 20. 5667 { # The set of columns' values that share the same ldiversity value. 5668 "numDistinctSensitiveValues": "A String", # Number of distinct sensitive values in this equivalence class. 5669 "quasiIdsValues": [ # Quasi-identifier values defining the k-anonymity equivalence 5670 # class. The order is always the same as the original request. 5671 { # Set of primitive values supported by the system. 5672 # Note that for the purposes of inspection or transformation, the number 5673 # of bytes considered to comprise a 'Value' is based on its representation 5674 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5675 # 123456789, the number of bytes would be counted as 9, even though an 5676 # int64 only holds up to 8 bytes of data. 5677 "floatValue": 3.14, 5678 "timestampValue": "A String", 5679 "dayOfWeekValue": "A String", 5680 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5681 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5682 # types are google.type.Date and `google.protobuf.Timestamp`. 5683 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5684 # to allow the value "24:00:00" for scenarios like business closing time. 5685 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5686 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5687 # allow the value 60 if it allows leap-seconds. 5688 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5689 }, 5690 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5691 # and time zone are either specified elsewhere or are not significant. The date 5692 # is relative to the Proleptic Gregorian Calendar. This can represent: 5693 # 5694 # * A full date, with non-zero year, month and day values 5695 # * A month and day value, with a zero year, e.g. an anniversary 5696 # * A year on its own, with zero month and day values 5697 # * A year and month value, with a zero day, e.g. a credit card expiration date 5698 # 5699 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5700 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5701 # a year. 5702 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5703 # if specifying a year by itself or a year and month where the day is not 5704 # significant. 5705 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5706 # month and day. 5707 }, 5708 "stringValue": "A String", 5709 "booleanValue": True or False, 5710 "integerValue": "A String", 5711 }, 5712 ], 5713 "topSensitiveValues": [ # Estimated frequencies of top sensitive values. 5714 { # A value of a field, including its frequency. 5715 "count": "A String", # How many times the value is contained in the field. 5716 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 5717 # Note that for the purposes of inspection or transformation, the number 5718 # of bytes considered to comprise a 'Value' is based on its representation 5719 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5720 # 123456789, the number of bytes would be counted as 9, even though an 5721 # int64 only holds up to 8 bytes of data. 5722 "floatValue": 3.14, 5723 "timestampValue": "A String", 5724 "dayOfWeekValue": "A String", 5725 "timeValue": { # Represents a time of day. The date and time zone are either not significant 5726 # or are specified elsewhere. An API may choose to allow leap seconds. Related 5727 # types are google.type.Date and `google.protobuf.Timestamp`. 5728 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 5729 # to allow the value "24:00:00" for scenarios like business closing time. 5730 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 5731 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 5732 # allow the value 60 if it allows leap-seconds. 5733 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 5734 }, 5735 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 5736 # and time zone are either specified elsewhere or are not significant. The date 5737 # is relative to the Proleptic Gregorian Calendar. This can represent: 5738 # 5739 # * A full date, with non-zero year, month and day values 5740 # * A month and day value, with a zero year, e.g. an anniversary 5741 # * A year on its own, with zero month and day values 5742 # * A year and month value, with a zero day, e.g. a credit card expiration date 5743 # 5744 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 5745 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 5746 # a year. 5747 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 5748 # if specifying a year by itself or a year and month where the day is not 5749 # significant. 5750 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 5751 # month and day. 5752 }, 5753 "stringValue": "A String", 5754 "booleanValue": True or False, 5755 "integerValue": "A String", 5756 }, 5757 }, 5758 ], 5759 "equivalenceClassSize": "A String", # Size of the k-anonymity equivalence class. 5760 }, 5761 ], 5762 "bucketValueCount": "A String", # Total number of distinct equivalence classes in this bucket. 5763 "bucketSize": "A String", # Total number of equivalence classes in this bucket. 5764 "sensitiveValueFrequencyUpperBound": "A String", # Upper bound on the sensitive value frequencies of the equivalence 5765 # classes in this bucket. 5766 "sensitiveValueFrequencyLowerBound": "A String", # Lower bound on the sensitive value frequencies of the equivalence 5767 # classes in this bucket. 5768 }, 5769 ], 5770 }, 5771 "requestedPrivacyMetric": { # Privacy metric to compute for reidentification risk analysis. # Privacy metric to compute. 5772 "numericalStatsConfig": { # Compute numerical stats over an individual column, including 5773 # min, max, and quantiles. 5774 "field": { # General identifier of a data field in a storage service. # Field to compute numerical stats on. Supported types are 5775 # integer, float, date, datetime, timestamp, time. 5776 "name": "A String", # Name describing the field. 5777 }, 5778 }, 5779 "kMapEstimationConfig": { # Reidentifiability metric. This corresponds to a risk model similar to what 5780 # is called "journalist risk" in the literature, except the attack dataset is 5781 # statistically modeled instead of being perfectly known. This can be done 5782 # using publicly available data (like the US Census), or using a custom 5783 # statistical model (indicated as one or several BigQuery tables), or by 5784 # extrapolating from the distribution of values in the input dataset. 5785 # A column with a semantic tag attached. 5786 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 5787 # Required if no column is tagged with a region-specific InfoType (like 5788 # US_ZIP_5) or a region code. 5789 "quasiIds": [ # Fields considered to be quasi-identifiers. No two columns can have the 5790 # same tag. [required] 5791 { 5792 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 5793 "name": "A String", # Name describing the field. 5794 }, 5795 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 5796 # indicate an auxiliary table that contains statistical information on 5797 # the possible values of this column (below). 5798 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 5799 # dataset as a statistical model of population, if available. We 5800 # currently support US ZIP codes, region codes, ages and genders. 5801 # To programmatically obtain the list of supported InfoTypes, use 5802 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 5803 "name": "A String", # Name of the information type. Either a name of your choosing when 5804 # creating a CustomInfoType, or one of the names listed 5805 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5806 # a built-in type. InfoType names should conform to the pattern 5807 # [a-zA-Z0-9_]{1,64}. 5808 }, 5809 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 5810 # the distribution of values in the input data 5811 # empty messages in your APIs. A typical example is to use it as the request 5812 # or the response type of an API method. For instance: 5813 # 5814 # service Foo { 5815 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 5816 # } 5817 # 5818 # The JSON representation for `Empty` is empty JSON object `{}`. 5819 }, 5820 }, 5821 ], 5822 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 5823 # used to tag a quasi-identifiers column must appear in exactly one column 5824 # of one auxiliary table. 5825 { # An auxiliary table contains statistical information on the relative 5826 # frequency of different quasi-identifiers values. It has one or several 5827 # quasi-identifiers columns, and one column that indicates the relative 5828 # frequency of each quasi-identifier tuple. 5829 # If a tuple is present in the data but not in the auxiliary table, the 5830 # corresponding relative frequency is assumed to be zero (and thus, the 5831 # tuple is highly reidentifiable). 5832 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 5833 # between 0 and 1 (inclusive). Null values are assumed to be zero. 5834 # [required] 5835 "name": "A String", # Name describing the field. 5836 }, 5837 "quasiIds": [ # Quasi-identifier columns. [required] 5838 { # A quasi-identifier column has a custom_tag, used to know which column 5839 # in the data corresponds to which column in the statistical model. 5840 "field": { # General identifier of a data field in a storage service. 5841 "name": "A String", # Name describing the field. 5842 }, 5843 "customTag": "A String", 5844 }, 5845 ], 5846 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 5847 # identified by its project_id, dataset_id, and table_name. Within a query 5848 # a table is often referenced with a string in the format of: 5849 # `<project_id>:<dataset_id>.<table_id>` or 5850 # `<project_id>.<dataset_id>.<table_id>`. 5851 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 5852 # If omitted, project ID is inferred from the API call. 5853 "tableId": "A String", # Name of the table. 5854 "datasetId": "A String", # Dataset ID of the table. 5855 }, 5856 }, 5857 ], 5858 }, 5859 "lDiversityConfig": { # l-diversity metric, used for analysis of reidentification risk. 5860 "sensitiveAttribute": { # General identifier of a data field in a storage service. # Sensitive field for computing the l-value. 5861 "name": "A String", # Name describing the field. 5862 }, 5863 "quasiIds": [ # Set of quasi-identifiers indicating how equivalence classes are 5864 # defined for the l-diversity computation. When multiple fields are 5865 # specified, they are considered a single composite key. 5866 { # General identifier of a data field in a storage service. 5867 "name": "A String", # Name describing the field. 5868 }, 5869 ], 5870 }, 5871 "deltaPresenceEstimationConfig": { # δ-presence metric, used to estimate how likely it is for an attacker to 5872 # figure out that one given individual appears in a de-identified dataset. 5873 # Similarly to the k-map metric, we cannot compute δ-presence exactly without 5874 # knowing the attack dataset, so we use a statistical model instead. 5875 "regionCode": "A String", # ISO 3166-1 alpha-2 region code to use in the statistical modeling. 5876 # Required if no column is tagged with a region-specific InfoType (like 5877 # US_ZIP_5) or a region code. 5878 "quasiIds": [ # Fields considered to be quasi-identifiers. No two fields can have the 5879 # same tag. [required] 5880 { # A column with a semantic tag attached. 5881 "field": { # General identifier of a data field in a storage service. # Identifies the column. [required] 5882 "name": "A String", # Name describing the field. 5883 }, 5884 "customTag": "A String", # A column can be tagged with a custom tag. In this case, the user must 5885 # indicate an auxiliary table that contains statistical information on 5886 # the possible values of this column (below). 5887 "infoType": { # Type of information detected by the API. # A column can be tagged with a InfoType to use the relevant public 5888 # dataset as a statistical model of population, if available. We 5889 # currently support US ZIP codes, region codes, ages and genders. 5890 # To programmatically obtain the list of supported InfoTypes, use 5891 # ListInfoTypes with the supported_by=RISK_ANALYSIS filter. 5892 "name": "A String", # Name of the information type. Either a name of your choosing when 5893 # creating a CustomInfoType, or one of the names listed 5894 # at https://cloud.google.com/dlp/docs/infotypes-reference when specifying 5895 # a built-in type. InfoType names should conform to the pattern 5896 # [a-zA-Z0-9_]{1,64}. 5897 }, 5898 "inferred": { # A generic empty message that you can re-use to avoid defining duplicated # If no semantic tag is indicated, we infer the statistical model from 5899 # the distribution of values in the input data 5900 # empty messages in your APIs. A typical example is to use it as the request 5901 # or the response type of an API method. For instance: 5902 # 5903 # service Foo { 5904 # rpc Bar(google.protobuf.Empty) returns (google.protobuf.Empty); 5905 # } 5906 # 5907 # The JSON representation for `Empty` is empty JSON object `{}`. 5908 }, 5909 }, 5910 ], 5911 "auxiliaryTables": [ # Several auxiliary tables can be used in the analysis. Each custom_tag 5912 # used to tag a quasi-identifiers field must appear in exactly one 5913 # field of one auxiliary table. 5914 { # An auxiliary table containing statistical information on the relative 5915 # frequency of different quasi-identifiers values. It has one or several 5916 # quasi-identifiers columns, and one column that indicates the relative 5917 # frequency of each quasi-identifier tuple. 5918 # If a tuple is present in the data but not in the auxiliary table, the 5919 # corresponding relative frequency is assumed to be zero (and thus, the 5920 # tuple is highly reidentifiable). 5921 "relativeFrequency": { # General identifier of a data field in a storage service. # The relative frequency column must contain a floating-point number 5922 # between 0 and 1 (inclusive). Null values are assumed to be zero. 5923 # [required] 5924 "name": "A String", # Name describing the field. 5925 }, 5926 "quasiIds": [ # Quasi-identifier columns. [required] 5927 { # A quasi-identifier column has a custom_tag, used to know which column 5928 # in the data corresponds to which column in the statistical model. 5929 "field": { # General identifier of a data field in a storage service. 5930 "name": "A String", # Name describing the field. 5931 }, 5932 "customTag": "A String", 5933 }, 5934 ], 5935 "table": { # Message defining the location of a BigQuery table. A table is uniquely # Auxiliary table location. [required] 5936 # identified by its project_id, dataset_id, and table_name. Within a query 5937 # a table is often referenced with a string in the format of: 5938 # `<project_id>:<dataset_id>.<table_id>` or 5939 # `<project_id>.<dataset_id>.<table_id>`. 5940 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 5941 # If omitted, project ID is inferred from the API call. 5942 "tableId": "A String", # Name of the table. 5943 "datasetId": "A String", # Dataset ID of the table. 5944 }, 5945 }, 5946 ], 5947 }, 5948 "categoricalStatsConfig": { # Compute numerical stats over an individual column, including 5949 # number of distinct values and value count distribution. 5950 "field": { # General identifier of a data field in a storage service. # Field to compute categorical stats on. All column types are 5951 # supported except for arrays and structs. However, it may be more 5952 # informative to use NumericalStats when the field type is supported, 5953 # depending on the data. 5954 "name": "A String", # Name describing the field. 5955 }, 5956 }, 5957 "kAnonymityConfig": { # k-anonymity metric, used for analysis of reidentification risk. 5958 "entityId": { # An entity in a dataset is a field or set of fields that correspond to a # Optional message indicating that multiple rows might be associated to a 5959 # single individual. If the same entity_id is associated to multiple 5960 # quasi-identifier tuples over distinct rows, we consider the entire 5961 # collection of tuples as the composite quasi-identifier. This collection 5962 # is a multiset: the order in which the different tuples appear in the 5963 # dataset is ignored, but their frequency is taken into account. 5964 # 5965 # Important note: a maximum of 1000 rows can be associated to a single 5966 # entity ID. If more rows are associated with the same entity ID, some 5967 # might be ignored. 5968 # single person. For example, in medical records the `EntityId` might be a 5969 # patient identifier, or for financial records it might be an account 5970 # identifier. This message is used when generalizations or analysis must take 5971 # into account that multiple rows correspond to the same entity. 5972 "field": { # General identifier of a data field in a storage service. # Composite key indicating which field contains the entity identifier. 5973 "name": "A String", # Name describing the field. 5974 }, 5975 }, 5976 "quasiIds": [ # Set of fields to compute k-anonymity over. When multiple fields are 5977 # specified, they are considered a single composite key. Structs and 5978 # repeated data types are not supported; however, nested fields are 5979 # supported so long as they are not structs themselves or nested within 5980 # a repeated field. 5981 { # General identifier of a data field in a storage service. 5982 "name": "A String", # Name describing the field. 5983 }, 5984 ], 5985 }, 5986 }, 5987 "categoricalStatsResult": { # Result of the categorical stats computation. 5988 "valueFrequencyHistogramBuckets": [ # Histogram of value frequencies in the column. 5989 { 5990 "bucketValues": [ # Sample of value frequencies in this bucket. The total number of 5991 # values returned per bucket is capped at 20. 5992 { # A value of a field, including its frequency. 5993 "count": "A String", # How many times the value is contained in the field. 5994 "value": { # Set of primitive values supported by the system. # A value contained in the field in question. 5995 # Note that for the purposes of inspection or transformation, the number 5996 # of bytes considered to comprise a 'Value' is based on its representation 5997 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 5998 # 123456789, the number of bytes would be counted as 9, even though an 5999 # int64 only holds up to 8 bytes of data. 6000 "floatValue": 3.14, 6001 "timestampValue": "A String", 6002 "dayOfWeekValue": "A String", 6003 "timeValue": { # Represents a time of day. The date and time zone are either not significant 6004 # or are specified elsewhere. An API may choose to allow leap seconds. Related 6005 # types are google.type.Date and `google.protobuf.Timestamp`. 6006 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 6007 # to allow the value "24:00:00" for scenarios like business closing time. 6008 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 6009 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 6010 # allow the value 60 if it allows leap-seconds. 6011 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 6012 }, 6013 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 6014 # and time zone are either specified elsewhere or are not significant. The date 6015 # is relative to the Proleptic Gregorian Calendar. This can represent: 6016 # 6017 # * A full date, with non-zero year, month and day values 6018 # * A month and day value, with a zero year, e.g. an anniversary 6019 # * A year on its own, with zero month and day values 6020 # * A year and month value, with a zero day, e.g. a credit card expiration date 6021 # 6022 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 6023 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 6024 # a year. 6025 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 6026 # if specifying a year by itself or a year and month where the day is not 6027 # significant. 6028 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 6029 # month and day. 6030 }, 6031 "stringValue": "A String", 6032 "booleanValue": True or False, 6033 "integerValue": "A String", 6034 }, 6035 }, 6036 ], 6037 "bucketValueCount": "A String", # Total number of distinct values in this bucket. 6038 "valueFrequencyUpperBound": "A String", # Upper bound on the value frequency of the values in this bucket. 6039 "valueFrequencyLowerBound": "A String", # Lower bound on the value frequency of the values in this bucket. 6040 "bucketSize": "A String", # Total number of values in this bucket. 6041 }, 6042 ], 6043 }, 6044 "deltaPresenceEstimationResult": { # Result of the δ-presence computation. Note that these results are an 6045 # estimation, not exact values. 6046 "deltaPresenceEstimationHistogram": [ # The intervals [min_probability, max_probability) do not overlap. If a 6047 # value doesn't correspond to any such interval, the associated frequency 6048 # is zero. For example, the following records: 6049 # {min_probability: 0, max_probability: 0.1, frequency: 17} 6050 # {min_probability: 0.2, max_probability: 0.3, frequency: 42} 6051 # {min_probability: 0.3, max_probability: 0.4, frequency: 99} 6052 # mean that there are no record with an estimated probability in [0.1, 0.2) 6053 # nor larger or equal to 0.4. 6054 { # A DeltaPresenceEstimationHistogramBucket message with the following 6055 # values: 6056 # min_probability: 0.1 6057 # max_probability: 0.2 6058 # frequency: 42 6059 # means that there are 42 records for which δ is in [0.1, 0.2). An 6060 # important particular case is when min_probability = max_probability = 1: 6061 # then, every individual who shares this quasi-identifier combination is in 6062 # the dataset. 6063 "bucketValues": [ # Sample of quasi-identifier tuple values in this bucket. The total 6064 # number of classes returned per bucket is capped at 20. 6065 { # A tuple of values for the quasi-identifier columns. 6066 "quasiIdsValues": [ # The quasi-identifier values. 6067 { # Set of primitive values supported by the system. 6068 # Note that for the purposes of inspection or transformation, the number 6069 # of bytes considered to comprise a 'Value' is based on its representation 6070 # as a UTF-8 encoded string. For example, if 'integer_value' is set to 6071 # 123456789, the number of bytes would be counted as 9, even though an 6072 # int64 only holds up to 8 bytes of data. 6073 "floatValue": 3.14, 6074 "timestampValue": "A String", 6075 "dayOfWeekValue": "A String", 6076 "timeValue": { # Represents a time of day. The date and time zone are either not significant 6077 # or are specified elsewhere. An API may choose to allow leap seconds. Related 6078 # types are google.type.Date and `google.protobuf.Timestamp`. 6079 "hours": 42, # Hours of day in 24 hour format. Should be from 0 to 23. An API may choose 6080 # to allow the value "24:00:00" for scenarios like business closing time. 6081 "nanos": 42, # Fractions of seconds in nanoseconds. Must be from 0 to 999,999,999. 6082 "seconds": 42, # Seconds of minutes of the time. Must normally be from 0 to 59. An API may 6083 # allow the value 60 if it allows leap-seconds. 6084 "minutes": 42, # Minutes of hour of day. Must be from 0 to 59. 6085 }, 6086 "dateValue": { # Represents a whole or partial calendar date, e.g. a birthday. The time of day 6087 # and time zone are either specified elsewhere or are not significant. The date 6088 # is relative to the Proleptic Gregorian Calendar. This can represent: 6089 # 6090 # * A full date, with non-zero year, month and day values 6091 # * A month and day value, with a zero year, e.g. an anniversary 6092 # * A year on its own, with zero month and day values 6093 # * A year and month value, with a zero day, e.g. a credit card expiration date 6094 # 6095 # Related types are google.type.TimeOfDay and `google.protobuf.Timestamp`. 6096 "year": 42, # Year of date. Must be from 1 to 9999, or 0 if specifying a date without 6097 # a year. 6098 "day": 42, # Day of month. Must be from 1 to 31 and valid for the year and month, or 0 6099 # if specifying a year by itself or a year and month where the day is not 6100 # significant. 6101 "month": 42, # Month of year. Must be from 1 to 12, or 0 if specifying a year without a 6102 # month and day. 6103 }, 6104 "stringValue": "A String", 6105 "booleanValue": True or False, 6106 "integerValue": "A String", 6107 }, 6108 ], 6109 "estimatedProbability": 3.14, # The estimated probability that a given individual sharing these 6110 # quasi-identifier values is in the dataset. This value, typically called 6111 # δ, is the ratio between the number of records in the dataset with these 6112 # quasi-identifier values, and the total number of individuals (inside 6113 # *and* outside the dataset) with these quasi-identifier values. 6114 # For example, if there are 15 individuals in the dataset who share the 6115 # same quasi-identifier values, and an estimated 100 people in the entire 6116 # population with these values, then δ is 0.15. 6117 }, 6118 ], 6119 "bucketValueCount": "A String", # Total number of distinct quasi-identifier tuple values in this bucket. 6120 "bucketSize": "A String", # Number of records within these probability bounds. 6121 "maxProbability": 3.14, # Always greater than or equal to min_probability. 6122 "minProbability": 3.14, # Between 0 and 1. 6123 }, 6124 ], 6125 }, 6126 "requestedSourceTable": { # Message defining the location of a BigQuery table. A table is uniquely # Input dataset to compute metrics over. 6127 # identified by its project_id, dataset_id, and table_name. Within a query 6128 # a table is often referenced with a string in the format of: 6129 # `<project_id>:<dataset_id>.<table_id>` or 6130 # `<project_id>.<dataset_id>.<table_id>`. 6131 "projectId": "A String", # The Google Cloud Platform project ID of the project containing the table. 6132 # If omitted, project ID is inferred from the API call. 6133 "tableId": "A String", # Name of the table. 6134 "datasetId": "A String", # Dataset ID of the table. 6135 }, 6136 }, 6137 "state": "A String", # State of a job. 6138 "jobTriggerName": "A String", # If created by a job trigger, the resource name of the trigger that 6139 # instantiated the job. 6140 "startTime": "A String", # Time when the job started. 6141 "endTime": "A String", # Time when the job finished. 6142 "type": "A String", # The type of job. 6143 "createTime": "A String", # Time when the job was created. 6144 }, 6145 ], 6146 }</pre> 6147</div> 6148 6149<div class="method"> 6150 <code class="details" id="list_next">list_next(previous_request, previous_response)</code> 6151 <pre>Retrieves the next page of results. 6152 6153Args: 6154 previous_request: The request for the previous page. (required) 6155 previous_response: The response from the request for the previous page. (required) 6156 6157Returns: 6158 A request object that you can call 'execute()' on to request the next 6159 page. Returns None if there are no more items in the collection. 6160 </pre> 6161</div> 6162 6163</body></html>