# The Art Of Scripting HTTP Requests Using Curl

## Background

This document assumes that you're familiar with HTML and general networking.

The increasing number of applications moving to the web has made "HTTP
Scripting" more frequently requested and wanted. Automatically extracting
information from the web, faking users, and posting or uploading data to
web servers are all important tasks today.

Curl is a command line tool for doing all sorts of URL manipulations and
transfers, but this particular document will focus on how to use it when
doing HTTP requests for fun and profit. I will assume that you know how to
invoke `curl --help` or `curl --manual` to get basic information about it.

Curl is not written to do everything for you. It makes the requests, it gets
the data, it sends data and it retrieves the information. You probably need
to glue everything together using some kind of script language or repeated
manual invocations.

## The HTTP Protocol

HTTP is the protocol used to fetch data from web servers. It is a very simple
protocol that is built upon TCP/IP. The protocol also allows information to
get sent to the server from the client using a few different methods, as will
be shown here.

HTTP is plain ASCII text lines being sent by the client to a server to
request a particular action, and then the server replies with a few text
lines before the actual requested content is sent to the client.

The client, curl, sends a HTTP request. The request contains a method (like
GET, POST, HEAD etc), a number of request headers and sometimes a request
body. The HTTP server responds with a status line (indicating if things went
well), response headers and most often also a response body. The "body" part
is the plain data you requested, like the actual HTML or the image etc.
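
To make the status line, headers and body described above a bit more
tangible, here is a minimal sketch that takes a response apart. It does not
touch the network: `sample_response` is a made-up stand-in for what
`curl --include` would print, and the three helper function names are my own
invention, not anything curl provides.

```shell
#!/bin/sh
# Hypothetical sample response, shaped like the output of `curl --include`:
# a status line, response headers, a blank line, then the body.
sample_response() {
  printf 'HTTP/1.1 200 OK\r\n'
  printf 'Content-Type: text/html\r\n'
  printf 'Content-Length: 14\r\n'
  printf '\r\n'
  printf '<html></html>\n'
}

# Print the numeric status code found on the status line (first line of stdin).
status_code() {
  tr -d '\r' | awk 'NR == 1 { print $2 }'
}

# Print only the response headers: the lines after the status line and
# before the first blank line.
response_headers() {
  tr -d '\r' | awk 'NR == 1 { next } /^$/ { done = 1 } !done { print }'
}

# Print only the body: everything after the first blank line.
response_body() {
  tr -d '\r' | awk 'body { print } /^$/ { body = 1 }'
}

sample_response | status_code
sample_response | response_headers
sample_response | response_body
```

With a real server you would pipe `curl --silent --include URL` into the same
helpers instead of `sample_response`.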

## See the Protocol

Using curl's option [`--verbose`](https://curl.haxx.se/docs/manpage.html#-v)
(`-v` as a short option) will display what kind of commands curl sends to the
server, as well as a few other informational texts.

`--verbose` is the single most useful option when it comes to debugging or
even understanding the curl<->server interaction.

Sometimes even `--verbose` is not enough. Then
[`--trace`](https://curl.haxx.se/docs/manpage.html#--trace) and
[`--trace-ascii`](https://curl.haxx.se/docs/manpage.html#--trace-ascii)
offer even more details as they show **everything** curl sends and
receives. Use it like this:

    curl --trace-ascii debugdump.txt http://www.example.com/

## See the Timing

Many times you may wonder what exactly is taking all the time, or you just
want to know the number of milliseconds between two points in a transfer. For
those, and other similar situations, the
[`--trace-time`](https://curl.haxx.se/docs/manpage.html#--trace-time) option
is what you need. It'll prepend the time to each trace output line:

    curl --trace-ascii d.txt --trace-time http://example.com/

## See the Response

By default curl sends the response to stdout. You need to redirect it
somewhere to avoid that, most often that is done with `-o` or `-O`.

# URL

## Spec

The Uniform Resource Locator format is how you specify the address of a
particular resource on the Internet. You know these, you've seen URLs like
https://curl.haxx.se or https://yourbank.com a million times. RFC 3986 is the
canonical spec. And yeah, the formal name is not URL, it is URI.

## Host

The host name is usually resolved using DNS or your /etc/hosts file to an IP
address and that's what curl will communicate with. Alternatively you specify
the IP address directly in the URL instead of a name.

For development and other trying out situations, you can point to a different
IP address for a host name than what would otherwise be used, by using curl's
[`--resolve`](https://curl.haxx.se/docs/manpage.html#--resolve) option:

    curl --resolve www.example.org:80:127.0.0.1 http://www.example.org/

## Port number

Each protocol curl supports operates on a default port number, be it over TCP
or in some cases UDP. Normally you don't have to take that into
consideration, but at times you run test servers on other ports or
similar. Then you can specify the port number in the URL with a colon and a
number immediately following the host name. Like when doing HTTP to port
1234:

    curl http://www.example.org:1234/

The port number you specify in the URL is the number that the server uses to
offer its services. Sometimes you may use a local proxy, and then you may
need to specify that proxy's port number separately for what curl needs to
connect to locally. Like when using a HTTP proxy on port 4321:

    curl --proxy http://proxy.example.org:4321 http://remote.example.org/

## User name and password

Some services are set up to require HTTP authentication and then you need to
provide name and password which is then transferred to the remote site in
various ways depending on the exact authentication protocol used.

You can opt to either insert the user and password in the URL or you can
provide them separately:

    curl http://user:password@example.org/

or

    curl -u user:password http://example.org/

You need to pay attention that this kind of HTTP authentication is not what
is usually done and requested by user-oriented websites these days. They tend
to use forms and cookies instead.

## Path part

The path part is just sent off to the server to request that it sends back
the associated response.
The path is everything to the right of the slash that follows the host name
and possibly the port number.

# Fetch a page

## GET

The simplest and most common request/operation made using HTTP is to GET a
URL. The URL could itself refer to a web page, an image or a file. The client
issues a GET request to the server and receives the document it asked for.
If you issue the command line

    curl https://curl.haxx.se

you get a web page returned in your terminal window. The entire HTML document
that that URL holds.

All HTTP replies contain a set of response headers that are normally hidden;
use curl's [`--include`](https://curl.haxx.se/docs/manpage.html#-i) (`-i`)
option to display them as well as the rest of the document.

## HEAD

You can ask the remote server for ONLY the headers by using the
[`--head`](https://curl.haxx.se/docs/manpage.html#-I) (`-I`) option which
will make curl issue a HEAD request. In some special cases servers deny the
HEAD method while others still work, which is a particular kind of annoyance.

The HEAD method is defined and made so that the server returns the headers
exactly the way it would do for a GET, but without a body. It means that you
may see a `Content-Length:` in the response headers, but there must not be an
actual body in the HEAD response.

## Multiple URLs in a single command line

A single curl command line may involve one or many URLs. The most common case
is probably to just use one, but you can specify any number of URLs. Yes
any. No limits. You'll then get requests repeated over and over for all the
given URLs.

Example, send two GETs:

    curl http://url1.example.com http://url2.example.com

If you use [`--data`](https://curl.haxx.se/docs/manpage.html#-d) to POST to
the URL, using multiple URLs means that you send that same POST to all the
given URLs.

Example, send two POSTs:

    curl --data name=curl http://url1.example.com http://url2.example.com

## Multiple HTTP methods in a single command line

Sometimes you need to operate on several URLs in a single command line and do
different HTTP methods on each. For this, you'll enjoy the
[`--next`](https://curl.haxx.se/docs/manpage.html#-:) option. It is basically
a separator that separates a bunch of options from the next. All the URLs
before `--next` will get the same method and will get all the POST data
merged into one.

When curl reaches the `--next` on the command line, it'll sort of reset the
method and the POST data and allow a new set.

Perhaps this is best shown with a few examples. To send first a HEAD and then
a GET:

    curl -I http://example.com --next http://example.com

To first send a POST and then a GET:

    curl -d score=10 http://example.com/post.cgi --next http://example.com/results.html

# HTML forms

## Forms explained

Forms are the general way a website can present a HTML page with fields for
the user to enter data in, and then press some kind of 'OK' or 'Submit'
button to get that data sent to the server. The server then typically uses
the posted data to decide how to act. Like using the entered words to search
in a database, or to add the info in a bug tracking system, display the
entered address on a map or using the info as a login-prompt verifying that
the user is allowed to see what it is about to see.

Of course there has to be some kind of program on the server end to receive
the data you send. You cannot just invent something out of the air.

## GET

A GET-form uses the method GET, as specified in HTML like:

    <form method="GET" action="junk.cgi">
      <input type=text name="birthyear">
      <input type=submit name=press value="OK">
    </form>

In your favorite browser, this form will appear with a text box to fill in
and a press-button labeled "OK". If you fill in '1905' and press the OK
button, your browser will then create a new URL to get for you. The URL will
get `junk.cgi?birthyear=1905&press=OK` appended to the path part of the
previous URL.

If the original form was seen on the page `www.example.com/when/birth.html`,
the second page you'll get will become
`www.example.com/when/junk.cgi?birthyear=1905&press=OK`.

Most search engines work this way.

To make curl do the GET form post for you, just enter the expected created
URL:

    curl "http://www.example.com/when/junk.cgi?birthyear=1905&press=OK"

## POST

The GET method makes all input field names get displayed in the URL field of
your browser. That's generally a good thing when you want to be able to
bookmark that page with your given data, but it is an obvious disadvantage if
you entered secret information in one of the fields or if there are a large
number of fields creating a very long and unreadable URL.

The HTTP protocol then offers the POST method. This way the client sends the
data separated from the URL and thus you won't see any of it in the URL
address field.

The form would look very similar to the previous one:

    <form method="POST" action="junk.cgi">
      <input type=text name="birthyear">
      <input type=submit name=press value=" OK ">
    </form>

And to use curl to post this form with the same data filled in as before, we
could do it like:

    curl --data "birthyear=1905&press=%20OK%20" http://www.example.com/when/junk.cgi

This kind of POST will use the Content-Type
`application/x-www-form-urlencoded` and is the most widely used POST kind.

With plain `--data`, the data you send to the server MUST already be properly
encoded; curl will not do that for you. For example, if you want the data to
contain a space, you need to replace that space with %20 etc. Failing to
comply with this will most likely cause your data to be received wrongly and
messed up.

Recent curl versions can in fact url-encode POST data for you, like this:

    curl --data-urlencode "name=I am Daniel" http://www.example.com

If you repeat `--data` several times on the command line, curl will
concatenate all the given data pieces - and put a `&` symbol between each
data segment.

## File Upload POST

Back in late 1995 they defined an additional way to post data over HTTP. It
is documented in RFC 1867, which is why this method is sometimes referred to
as RFC1867-posting.

This method is mainly designed to better support file uploads. A form that
allows a user to upload a file could be written like this in HTML:

    <form method="POST" enctype='multipart/form-data' action="upload.cgi">
      <input type=file name=upload>
      <input type=submit name=press value="OK">
    </form>

This clearly shows that the Content-Type about to be sent is
`multipart/form-data`.

To post to a form like this with curl, you enter a command line like:

    curl --form upload=@localfilename --form press=OK [URL]

## Hidden Fields

A very common way for HTML based applications to pass state information
between pages is to add hidden fields to the forms. Hidden fields are already
filled in, they aren't displayed to the user and they get passed along just
as all the other fields.

A similar example form with one visible field, one hidden field and one
submit button could look like:

    <form method="POST" action="foobar.cgi">
      <input type=text name="birthyear">
      <input type=hidden name="person" value="daniel">
      <input type=submit name="press" value="OK">
    </form>

To POST this with curl, you won't have to think about if the fields are
hidden or not. To curl they're all the same:

    curl --data "birthyear=1905&press=OK&person=daniel" [URL]

## Figure Out What A POST Looks Like

When you're about to fill in a form and send it to a server by using curl
instead of a browser, you're of course very interested in sending a POST
exactly the way your browser does.

An easy way to get to see this, is to save the HTML page with the form on
your local disk, modify the 'method' to a GET, and press the submit button
(you could also change the action URL if you want to).

You will then clearly see the data get appended to the URL, separated with a
`?` character as GET forms are supposed to.

# HTTP upload

## PUT

Perhaps the best way to upload data to a HTTP server is to use PUT. Then
again, this of course requires that someone put a program or script on the
server end that knows how to receive a HTTP PUT stream.

Put a file to a HTTP server with curl:

    curl --upload-file uploadfile http://www.example.com/receive.cgi

# HTTP Authentication

## Basic Authentication

HTTP Authentication is the ability to tell the server your username and
password so that it can verify that you're allowed to do the request you're
doing. The Basic authentication used in HTTP (which is the type curl uses by
default) is **plain text** based, which means it sends username and password
only slightly obfuscated, but still fully readable by anyone that sniffs on
the network between you and the remote server.

To tell curl to use a user and password for authentication:

    curl --user name:password http://www.example.com

## Other Authentication

The site might require a different authentication method (check the headers
returned by the server), and then
[`--ntlm`](https://curl.haxx.se/docs/manpage.html#--ntlm),
[`--digest`](https://curl.haxx.se/docs/manpage.html#--digest),
[`--negotiate`](https://curl.haxx.se/docs/manpage.html#--negotiate) or even
[`--anyauth`](https://curl.haxx.se/docs/manpage.html#--anyauth) might be
options that suit you.

## Proxy Authentication

Sometimes your HTTP access is only available through the use of a HTTP
proxy. This seems to be especially common at various companies. A HTTP proxy
may require its own user and password to allow the client to get through to
the Internet. To specify those with curl, run something like:

    curl --proxy-user proxyuser:proxypassword curl.haxx.se

If your proxy requires the authentication to be done using the NTLM method,
use [`--proxy-ntlm`](https://curl.haxx.se/docs/manpage.html#--proxy-ntlm), if
it requires Digest use
[`--proxy-digest`](https://curl.haxx.se/docs/manpage.html#--proxy-digest).
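
As a side note on the Basic method (which a proxy can ask for as well), the
sketch below shows how thin the "obfuscation" really is: the header value is
just `user:password` run through base64. The helper name `basic_auth_header`
is made up for this illustration, and it assumes a `base64` utility is
installed on your system.

```shell
#!/bin/sh
# Build the Authorization header that Basic authentication sends.
# The credentials are only base64-encoded, not encrypted.
# `basic_auth_header` is an illustrative helper, not part of curl.
basic_auth_header() {
  # $1 = user name, $2 = password
  encoded=$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')
  printf 'Authorization: Basic %s\n' "$encoded"
}

basic_auth_header name password
# prints: Authorization: Basic bmFtZTpwYXNzd29yZA==
```

Anyone who captures that header can decode it again with the same `base64`
tool, which is why Basic authentication should only be trusted over an
encrypted connection.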

If you use any one of these user+password options but leave out the password
part, curl will prompt for the password interactively.

## Hiding credentials

Do note that when a program is run, its parameters might be possible to see
when listing the running processes of the system. Thus, other users may be
able to watch your passwords if you pass them as plain command line
options. There are ways to circumvent this, such as putting the options in a
file that curl reads with `--config` or keeping the credentials in a
`.netrc` file used with `--netrc`.

It is worth noting that while this is how HTTP Authentication works, very
many websites will not use this concept when they provide logins etc. See the
Web Login chapter further below for more details on that.

# More HTTP Headers

## Referer

A HTTP request may include a 'referer' field (yes it is misspelled), which
can be used to tell from which URL the client got to this particular
resource. Some programs/scripts check the referer field of requests to verify
that this wasn't arriving from an external site or an unknown page. While
this is a stupid way to check something so easily forged, many scripts still
do it. Using curl, you can put anything you want in the referer-field and
thus more easily be able to fool the server into serving your request.

Use curl to set the referer field with:

    curl --referer http://www.example.com http://www.example.com

## User Agent

Very similar to the referer field, all HTTP requests may set the User-Agent
field. It names what user agent (client) that is being used. Many
applications use this information to decide how to display pages. Silly web
programmers try to make different pages for users of different browsers to
make them look the best possible for their particular browsers. They usually
also do different kinds of javascript, vbscript etc.

At times, you will see that getting a page with curl will not return the same
page that you see when getting the page with your browser. Then you know it
is time to set the User Agent field to fool the server into thinking you're
one of those browsers.

To make curl look like Internet Explorer 5 on a Windows 2000 box:

    curl --user-agent "Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)" [URL]

Or why not look like you're using Netscape 4.73 on an old Linux box:

    curl --user-agent "Mozilla/4.73 [en] (X11; U; Linux 2.2.15 i686)" [URL]

# Redirects

## Location header

When a resource is requested from a server, the reply from the server may
include a hint about where the browser should go next to find this page, or a
new page keeping newly generated output. The header that tells the browser to
redirect is `Location:`.

Curl does not follow `Location:` headers by default, but will simply display
such pages in the same manner it displays all HTTP replies. It does however
feature an option that will make it attempt to follow the `Location:`
pointers.

To tell curl to follow a Location:

    curl --location http://www.example.com

If you use curl to POST to a site that immediately redirects you to another
page, you can safely use
[`--location`](https://curl.haxx.se/docs/manpage.html#-L) (`-L`) and
`--data`/`--form` together. curl will only use POST in the first request, and
then revert to GET in the following operations.

## Other redirects

Browsers typically support at least two other ways of redirecting that curl
doesn't: first, the HTML may contain a meta refresh tag that asks the browser
to load a specific URL after a set number of seconds, or it may use
javascript to do it.

# Cookies

## Cookie Basics

The way the web browsers do "client side state control" is by using
cookies.
Cookies are just names with associated contents. The cookies are
sent to the client by the server. The server tells the client for what path
and host name it wants the cookie sent back, and it also sends an expiration
date and a few more properties.

When a client communicates with a server with a name and path as previously
specified in a received cookie, the client sends back the cookies and their
contents to the server, unless of course they are expired.

Many applications and servers use this method to connect a series of requests
into a single logical session. To be able to use curl in such occasions, we
must be able to record and send back cookies the way the web application
expects them. The same way browsers deal with them.

## Cookie options

The simplest way to send a few cookies to the server when getting a page with
curl is to add them on the command line like:

    curl --cookie "name=Daniel" http://www.example.com

Cookies are sent as common HTTP headers. This is practical as it allows curl
to record cookies simply by recording headers. Record cookies with curl by
using the [`--dump-header`](https://curl.haxx.se/docs/manpage.html#-D) (`-D`)
option like:

    curl --dump-header headers_and_cookies http://www.example.com

(Take note that the
[`--cookie-jar`](https://curl.haxx.se/docs/manpage.html#-c) option described
below is a better way to store cookies.)

Curl has a full blown cookie parsing engine built-in that comes in use if you
want to reconnect to a server and use cookies that were stored from a
previous connection (or hand-crafted manually to fool the server into
believing you had a previous connection).
To use previously stored cookies,
you run curl like:

    curl --cookie stored_cookies_in_file http://www.example.com

Curl's "cookie engine" gets enabled when you use the
[`--cookie`](https://curl.haxx.se/docs/manpage.html#-b) option. If you only
want curl to understand received cookies, use `--cookie` with a file that
doesn't exist. For example, if you want to let curl understand cookies from a
page and follow a location (and thus possibly send back cookies it received),
you can invoke it like:

    curl --cookie nada --location http://www.example.com

Curl has the ability to read and write cookie files that use the same file
format that Netscape and Mozilla once used. It is a convenient way to share
cookies between scripts or invocations. The `--cookie` (`-b`) switch
automatically detects if a given file is such a cookie file and parses it,
and by using the `--cookie-jar` (`-c`) option you'll make curl write a new
cookie file at the end of an operation:

    curl --cookie cookies.txt --cookie-jar newcookies.txt \
      http://www.example.com

# HTTPS

## HTTPS is HTTP secure

There are a few ways to do secure HTTP transfers. By far the most common
protocol for doing this is what is generally known as HTTPS, HTTP over
SSL. SSL encrypts all the data that is sent and received over the network and
thus makes it harder for attackers to spy on sensitive information.

SSL (or TLS as the latest version of the standard is called) offers a
truckload of advanced features to allow all those encryptions and key
infrastructure mechanisms encrypted HTTP requires.

Curl supports encrypted fetches when built to use a TLS library and it can be
built to use one out of a fairly large set of libraries - `curl -V` will show
which one your curl was built to use (if any!).
To get a page from a HTTPS
server, simply run curl like:

    curl https://secure.example.com

## Certificates

In the HTTPS world, you use certificates to validate that you are the one
you claim to be, as an addition to normal passwords. Curl supports
client-side certificates. Such certificates are usually locked with a pass
phrase, which you need to enter before the certificate can be used by
curl. The pass phrase can be specified on the command line or if not, entered
interactively when curl queries for it. Use a certificate with curl on a
HTTPS server like:

    curl --cert mycert.pem https://secure.example.com

curl also tries to verify that the server is who it claims to be, by
verifying the server's certificate against a locally stored CA cert
bundle. Failing the verification will cause curl to deny the connection. You
must then use [`--insecure`](https://curl.haxx.se/docs/manpage.html#-k)
(`-k`) in case you want to tell curl to ignore that the server can't be
verified.

More about server certificate verification and CA cert bundles can be read in
the [SSLCERTS document](https://curl.haxx.se/docs/sslcerts.html).

At times you may end up with your own CA cert store and then you can tell
curl to use that to verify the server's certificate:

    curl --cacert ca-bundle.pem https://example.com/

# Custom Request Elements

## Modify method and headers

Doing fancy stuff, you may need to add or change elements of a single curl
request.

For example, you can change the POST request to a PROPFIND and send the data
as `Content-Type: text/xml` (instead of the default Content-Type) like this:

    curl --data "<xml>" --header "Content-Type: text/xml" \
      --request PROPFIND example.com

You can delete a default header by providing one without content.
For instance,
you can ruin the request by chopping off the `Host:` header:

    curl --header "Host:" http://www.example.com

You can add headers the same way. Your server may want a `Destination:`
header, and you can add it:

    curl --header "Destination: http://nowhere" http://example.com

## More on changed methods

It should be noted that curl selects which methods to use on its own
depending on what action to ask for. `-d` will do POST, `-I` will do HEAD and
so on. If you use the
[`--request`](https://curl.haxx.se/docs/manpage.html#-X) / `-X` option you
can change the method keyword curl selects, but you will not modify curl's
behavior. This means that if you for example use `-d "data"` to do a POST, you
can modify the method to a `PROPFIND` with `-X` and curl will still think it
sends a POST. You can change the normal GET to a POST method by simply
adding `-X POST` in a command line like:

    curl -X POST http://example.org/

... but curl will still think and act as if it sent a GET so it won't send
any request body etc.

# Web Login

## Some login tricks

While not strictly just HTTP related, it still causes a lot of people
problems so here's the executive run-down of how the vast majority of all
login forms work and how to login to them using curl.

It can also be noted that to do this properly in an automated fashion, you
will most certainly need to script things and do multiple curl invocations
etc.

First, servers mostly use cookies to track the logged-in status of the
client, so you will need to capture the cookies you receive in the
responses. Then, many sites also set a special cookie on the login page (to
make sure you got there through their login page) so you should make a habit
of first getting the login-form page to capture the cookies set there.
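
The two-step habit described above could be scripted roughly like this. It is
only a sketch: the function names, the form field names `user` and
`password`, and the file names are all placeholders I made up, and a real
login form will almost certainly want different fields. The `CURL` variable
exists only so you can set `CURL=echo` and see the command lines without
sending anything.

```shell
#!/bin/sh
# Sketch of a cookie-aware login. All URLs, field names and file names
# used here are hypothetical placeholders; adapt them to the real form.
CURL=${CURL:-curl}          # override (e.g. CURL=echo) for a dry run
JAR=${JAR:-cookies.txt}     # file used to keep cookies between invocations

# Step 1: fetch the login form page so any cookie it sets lands in the jar.
fetch_login_page() {
  # $1 = URL of the page holding the login form
  "$CURL" --silent --cookie "$JAR" --cookie-jar "$JAR" \
    --output login.html "$1"
}

# Step 2: POST the credentials, sending back the cookies captured in step 1
# and recording any new ones (such as a session cookie).
post_login() {
  # $1 = form action URL, $2 = user name, $3 = password
  "$CURL" --silent --cookie "$JAR" --cookie-jar "$JAR" \
    --data-urlencode "user=$2" --data-urlencode "password=$3" "$1"
}

# Example use (not run here, the URLs are made up):
#   fetch_login_page http://www.example.com/login.html
#   post_login http://www.example.com/login.cgi daniel secret
```

Later requests that need the logged-in session would pass the same
`--cookie "$JAR"` option.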

Some web-based login systems feature various amounts of javascript, and
sometimes they use such code to set or modify cookie contents. Possibly they
do that to prevent programmed logins, like this manual describes how to...
Anyway, if reading the code isn't enough to let you repeat the behavior
manually, capturing the HTTP requests done by your browsers and analyzing the
sent cookies is usually a working method to work out how to shortcut the
javascript need.

In the actual `<form>` tag for the login, lots of sites fill in
random/session or otherwise secretly generated hidden tags and you may need
to first capture the HTML code for the login form and extract all the hidden
fields to be able to do a proper login POST. Remember that the contents need
to be URL encoded when sent in a normal POST.

# Debug

## Some debug tricks

Many times when you run curl on a site, you'll notice that the site doesn't
seem to respond the same way to your curl requests as it does to your
browser's.

Then you need to start making your curl requests more similar to your
browser's requests:

- Use the `--trace-ascii` option to store fully detailed logs of the requests
  for easier analyzing and better understanding

- Make sure you check for and use cookies when needed (both reading with
  `--cookie` and writing with `--cookie-jar`)

- Set user-agent (with [`-A`](https://curl.haxx.se/docs/manpage.html#-A)) to
  one like a recent popular browser does

- Set referer (with [`-e`](https://curl.haxx.se/docs/manpage.html#-e)) like
  it is set by the browser

- If you use POST, make sure you send all the fields and in the same order as
  the browser does it.
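
The checklist above can be folded into a single command line, roughly like
the sketch below. The function name, the user-agent string, the file names
and the URLs are placeholders of mine; copy the real values from what your
own browser sends. Setting `CURL=echo` prints the command line instead of
running it.

```shell
#!/bin/sh
# Sketch: one request that applies the checklist above. The user-agent
# string, referer, file names and URL are placeholders to be replaced
# with what your own browser actually sends.
CURL=${CURL:-curl}   # override (e.g. CURL=echo) to print instead of run

browser_like_get() {
  # $1 = URL to fetch, $2 = referer the browser would have sent
  "$CURL" --trace-ascii trace.txt \
    --cookie cookies.txt --cookie-jar cookies.txt \
    --user-agent "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0" \
    --referer "$2" \
    --output page.html \
    "$1"
}

# Example use (not run here, the URLs are made up):
#   browser_like_get http://www.example.com/page.html http://www.example.com/
```

Afterwards `trace.txt` holds the full request and response log, which you can
compare line by line with what the browser sent.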

## Check what the browsers do

A very good helper to make sure you do this right, is the web browsers'
developer tools that let you view all headers you send and receive (even
when using HTTPS).

A more raw approach is to capture the HTTP traffic on the network with tools
such as Wireshark or tcpdump and check which headers were sent and
received by the browser. (HTTPS forces you to use `SSLKEYLOGFILE` to do
that.)
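
As a rough sketch of the `SSLKEYLOGFILE` approach applied to curl itself:
the environment variable names a file that the TLS library writes session
keys to, which a tool like Wireshark can then use to decrypt a capture. This
assumes your curl is built with a TLS backend that honors the variable, which
not every build is. The function name and file name are placeholders.

```shell
#!/bin/sh
# Sketch: run one HTTPS request with TLS key logging switched on.
# Assumes a curl build whose TLS backend honors SSLKEYLOGFILE.
CURL=${CURL:-curl}   # override (e.g. CURL=echo) to print instead of run

fetch_with_keylog() {
  # $1 = file to write TLS session keys to, $2 = HTTPS URL to fetch
  SSLKEYLOGFILE="$1" "$CURL" --silent --output /dev/null "$2"
}

# Example use (not run here):
#   fetch_with_keylog tls-keys.log https://example.com/
```

Treat the key log file like a password: anyone holding it together with the
captured traffic can read the whole exchange.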