App & Browser Testing Made Easy

Give your users a seamless experience by testing on 3000+ real devices and browsers. Don't compromise with emulators and simulators


How to use Proxy in Puppeteer?

By Pawan Kumar, Community Contributor -

Puppeteer is a Node.js library developed by the Chrome team at Google. It provides a high-level API to control and automate headless Chrome or Chromium browsers. Headless browsers don’t have a graphical user interface, allowing them to run in the background without displaying a window. 

  • Puppeteer allows you to interact with web pages programmatically, enabling tasks such as web scraping, automated testing, generating screenshots or PDFs of web pages, automating form submissions, and much more. 
  • It provides a wide range of functionalities, including navigation, DOM manipulation, network interception, and JavaScript execution within the context of a web page.
  • Proxies play a significant role in Puppeteer by enabling you to route your network requests through intermediary servers. 
  • They provide anonymity, bypass restrictions, and facilitate tasks like web scraping and automation. Hence, understanding proxies in Puppeteer is crucial for effective usage. 

This article covers how to set up a proxy in Puppeteer, debug proxy problems, and use advanced proxy configurations.

What is Proxy in Puppeteer? 

In Puppeteer, a proxy is a server or an intermediary between the client (Puppeteer) and the target website or resource. It acts as a middleman, forwarding requests and responses between the client and the server. 

  • Puppeteer provides built-in support for using proxies, allowing you to manipulate network traffic and simulate various scenarios during web scraping, automated testing, or other web automation tasks.
  • You configure a proxy when launching the browser. Puppeteer has no built-in per-page proxy setting, although a separate browser context can be given its own proxy server.
  • It supports different proxies, including HTTP, HTTPS, and SOCKS5 proxies.
  • Specify the proxy server address and port with the --proxy-server argument passed to puppeteer.launch(). If the proxy requires a username and password, supply them with page.authenticate().
  • Puppeteer has no built-in proxy rotation, but you can switch between proxy servers by launching browsers with different proxy settings or by using a rotating proxy service. This can help overcome rate limits, avoid IP-based restrictions, or scrape data from websites that employ anti-bot measures.

Once a proxy is configured, all requests made by Puppeteer will be routed through the proxy server. You can intercept and modify network requests and responses using the page.setRequestInterception(true) method and request/response events. This allows you to manipulate the traffic, block specific requests, modify headers, or inject scripts as needed.
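
As a rough illustration of the interception described above, the sketch below blocks a few heavy resource types so that less traffic goes through the proxy. The set of blocked types and the helper names (shouldBlock, blockHeavyResources) are assumptions made for this example, not part of Puppeteer.

```javascript
// Resource types that this sketch treats as non-essential (an assumption;
// adjust the set to suit the pages you automate).
const BLOCKED_RESOURCE_TYPES = new Set(['image', 'media', 'font']);

// Pure helper: decide whether a request should be dropped.
function shouldBlock(resourceType) {
  return BLOCKED_RESOURCE_TYPES.has(resourceType);
}

// Turn on interception for a Puppeteer page and abort blocked requests.
// Every intercepted request must be either aborted or continued,
// otherwise the page stalls.
async function blockHeavyResources(page) {
  await page.setRequestInterception(true);
  page.on('request', (request) => {
    if (shouldBlock(request.resourceType())) {
      request.abort();
    } else {
      request.continue();
    }
  });
}
```

Call blockHeavyResources(page) after browser.newPage() and before page.goto(), so the handler is in place when the first request is made.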

Importance of Proxies in Puppeteer

Proxy usage in Puppeteer is essential for several reasons, making it an important feature for web scraping, automated testing, and other tasks. 

Here are some key reasons proxies are important in Puppeteer:

  • Anonymity and Privacy: Proxies allow you to hide your real IP address and location while accessing websites. This is crucial for web scraping, as it helps prevent your IP from being blocked or blacklisted by websites that restrict excessive requests. By rotating through a pool of proxies, you can distribute your requests and avoid detection, maintaining anonymity and protecting your privacy.
  • Bypassing Restrictions and Bans: Many websites implement IP-based restrictions, blocking or limiting access to certain regions or IP addresses. With proxies, you can overcome these restrictions by routing your requests through proxy servers in different locations. This enables you to scrape data from geo-restricted websites or access content that is otherwise inaccessible from your actual location.
  • Load Balancing and Performance: Using proxies for load balancing can be beneficial when performing intensive web scraping or automated testing tasks. By distributing the requests across multiple proxy servers, you can avoid overloading a single server and ensure a smooth and efficient scraping process. This can help prevent IP bans and maintain a high level of performance.
  • Captcha Solving: Many websites use captchas to defend against automated access. Proxies can help mitigate the impact of captchas by rotating through different IP addresses. By spreading the requests across multiple proxies, the likelihood of encountering captchas for each IP address is reduced, and the captcha-solving load is distributed. This helps improve the efficiency of web scraping and reduces the interruption caused by captchas.
  • Geolocation Testing: Proxies are invaluable when testing websites with location-based services or localized content. You can simulate user interactions from various regions by routing requests through proxies in different geographic locations. This allows you to verify the accuracy of location-based features, test localized website content, and ensure a consistent user experience across different regions.
  • Security Testing: Proxies are crucial in security testing and vulnerability assessments. By intercepting and analyzing network traffic, you can identify potential security vulnerabilities, inspect requests and responses, and detect potential security risks. Proxies enable you to manipulate and modify network traffic, inject custom scripts, and simulate different attack scenarios. This helps ensure the security and integrity of web applications.
  • Scraping Large Websites: Proxies are important to manage the scraping process effectively when scraping large websites. Websites with robust anti-scraping measures may limit the number of requests per IP address or implement rate limiting. By rotating through a pool of proxies, you can distribute the requests and scrape the website without triggering these restrictions. This ensures uninterrupted scraping and improves the efficiency of data collection.

Setting up a Proxy Server

To use a proxy server with Puppeteer, you must launch the browser with the appropriate proxy configuration. Chromium accepts the --proxy-server flag, which allows you to specify the proxy server address and port.

const puppeteer = require('puppeteer');

(async () => {
  const proxyServer = 'your-proxy-server-ip:your-proxy-server-port';

  // Launch Puppeteer with proxy configuration
  const browser = await puppeteer.launch({
    headless: true,
    args: [`--proxy-server=${proxyServer}`]
  });

  const page = await browser.newPage();

  // Navigate to a website
  await page.goto('https://www.browserstack.com/');

  await browser.close();
})();
  • The code above defines the proxyServer variable with your desired proxy server’s IP address and port.
  • Next, we launch Puppeteer using the puppeteer.launch() method. Within the args option, we pass the --proxy-server flag followed by the proxy server address and port specified in the proxyServer variable.
  • This configures the browser to use the specified proxy server for all requests it makes.
  • After launching Puppeteer, we create a new page with browser.newPage(). Then, we can navigate to any website using page.goto('https://www.browserstack.com/').
  • Replace the URL with the website you want to scrape or automate.

You can continue your scraping or automation tasks within the page context, interacting with the website as needed. Finally, remember to close the browser using browser.close() to terminate the Puppeteer instance.

Note that the code assumes you have Puppeteer installed (npm install puppeteer) and have imported the Puppeteer module (const puppeteer = require('puppeteer')) at the beginning of your script. Replace 'your-proxy-server-ip:your-proxy-server-port' with your desired proxy server’s IP address and port.

By setting up the proxy server in this manner, Puppeteer will route all requests through the specified proxy server, allowing you to perform web scraping or automation tasks with the desired proxy configuration.
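
When the proxy scheme or an exclusion list varies between environments, it can help to build the launch arguments in one place. The following is a minimal sketch under that assumption; buildProxyArgs and its parameter names are hypothetical, while --proxy-server and --proxy-bypass-list are Chromium command-line switches.

```javascript
// Build Chromium command-line arguments for a proxy.
// `scheme`, `host`, `port`, and `bypass` are illustrative parameter names.
function buildProxyArgs({ scheme = 'http', host, port, bypass = [] }) {
  if (!host || !port) {
    throw new Error('A proxy host and port are required');
  }
  const args = [`--proxy-server=${scheme}://${host}:${port}`];
  if (bypass.length > 0) {
    // Hosts in the bypass list are contacted directly, not through the proxy.
    args.push(`--proxy-bypass-list=${bypass.join(';')}`);
  }
  return args;
}

// Example: a SOCKS5 proxy with one host excluded (placeholder values).
const args = buildProxyArgs({
  scheme: 'socks5',
  host: '127.0.0.1',
  port: 1080,
  bypass: ['internal.example.com'],
});
console.log(args);
```

The resulting array can be passed directly as puppeteer.launch({ args }).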

IP Rotation with Puppeteer

IP rotation refers to using multiple IP addresses in a rotation to perform web scraping or automation tasks. This technique helps prevent IP blocking or detection by websites and allows for more extensive data collection or automation without being easily identified.

To set up IP rotation with Puppeteer using a proxy server, follow these steps:

  1. Choose a Proxy Server: Select a reliable proxy provider offering rotating IP addresses. Rotating proxies assign a new IP address for each request or after a certain time interval. Choosing a proxy service that provides the desired protocol (HTTP, HTTPS, or SOCKS) and supports IP rotation is essential.
  2. Setting up the Proxy Server: Sign up for the chosen proxy server and obtain the necessary credentials, including the IP address, port, username, password, and authentication method. These credentials will be used to connect to the proxy server.
  3. Testing the Proxy Server: Before integrating the proxy server with Puppeteer, it is advisable to ensure its functionality. You can use tools like cURL or browser extensions like FoxyProxy to verify that you can connect to the proxy server and receive responses.
  4. Creating a new Puppeteer instance with Proxy settings: In your Puppeteer code, use the puppeteer.launch() method to create a new Puppeteer instance with the appropriate proxy settings. Pass the proxy server’s IP address and port as a command-line argument to the browser. Chromium has no command-line flag for proxy credentials, so if the proxy server requires authentication, provide the username and password with page.authenticate() before navigating.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    args: ['--proxy-server=your-proxy-server-ip:your-proxy-server-port']
  });

  try {
    const page = await browser.newPage();

    // Only needed if the proxy requires authentication
    await page.authenticate({ username: 'username', password: 'password' });

    await page.goto('https://www.browserstack.com/');
  } finally {
    await browser.close();
  }
})();

Replace 'your-proxy-server-ip:your-proxy-server-port' with the actual IP address and port of the proxy server. If authentication is required, replace 'username' and 'password' with the appropriate credentials.

Configuring the Proxy Server in Puppeteer: Once the Puppeteer instance is launched with the proxy settings, Puppeteer will automatically route all browser traffic through the specified proxy server. You don’t need to configure the proxy server explicitly for each request in your Puppeteer code. 

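This ensures that all HTTP requests made by Puppeteer are sent through the proxy server. The rotation itself happens on the provider’s side: a rotating proxy endpoint assigns a different outgoing IP address per request or per time interval.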

// Example Puppeteer code using the proxy
// (runs inside the async function, after puppeteer.launch())


const page = await browser.newPage();
await page.goto('https://www.browserstack.com/');


// Continue with your automation tasks
// ...

Inside your Puppeteer code, you can use browser.newPage() to create a new page and page.goto() to navigate to the desired website or perform your automation tasks. All requests made by Puppeteer, including subsequent navigation and API calls, will be automatically routed through the proxy server specified during the Puppeteer instance setup.

By following these steps, we can set up IP rotation with Puppeteer using a proxy server. Puppeteer handles the routing of requests through the proxy server, allowing us to achieve IP rotation and perform web scraping or automation tasks while using different IP addresses.

This helps bypass IP blocking and anti-scraping measures and avoid detection during web data collection or automation.

Troubleshooting Puppeteer Proxy Server Issues

When troubleshooting Puppeteer proxy server issues, there are several steps you can take to identify and resolve common problems. 

Debugging Common Proxy Issues 

Work through the following checks to debug common proxy issues:

1. Check the proxy configuration

  • Verify that you have correctly set up the proxy server in your Puppeteer code. Ensure you use the correct proxy address and port and, if required, the correct authentication credentials.
  • Double-check that you’re passing the proxy options correctly to the puppeteer.launch() function.

2. Test proxy connectivity

  • Attempt to connect to the proxy server using other tools or command-line utilities like curl or telnet. This will help you determine whether the issue is with Puppeteer or the proxy server.
  • Ensure the proxy server is running and that no network or firewall restrictions prevent your connection.

3. Verify proxy response

  • Send a simple HTTP request through the proxy using a tool such as curl or a browser extension. Check whether you receive the expected response.
  • Inspect the response headers and ensure the proxy server does not add any unexpected headers or modify the content.

4. Debug Puppeteer-specific issues

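  • Enable verbose logging to get more details about the connection and any error messages. You can pass dumpio: true to puppeteer.launch() to pipe the browser’s output to the console, or set the DEBUG=puppeteer:* environment variable to log Puppeteer’s own activity. Launching with puppeteer.launch({ headless: false, devtools: true }) also opens DevTools so you can inspect requests in the Network tab.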
  • Check for any error messages related to the proxy server or network connections. Look for clues about authentication failures, connection timeouts, or proxy-related errors.

5. Test without a proxy

  • Temporarily remove the proxy configuration in your Puppeteer code and test your script without a proxy server. If the script works correctly, it indicates that the issue lies with the proxy setup.
  • Try using a different proxy server to see if the problem persists. This can help determine if the issue is specific to the proxy server or a more general problem.
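
To make the error messages easier to act on, the checks above can be paired with a small lookup that turns Chromium network error text into a likely cause. This is only a sketch: the error codes are ones Chromium reports, but the hints are rules of thumb and diagnoseProxyError is a hypothetical helper name.

```javascript
// Map fragments of Chromium network error text to a likely cause.
// The hints are rules of thumb, not an exhaustive list.
const PROXY_ERROR_HINTS = [
  ['ERR_PROXY_CONNECTION_FAILED', 'Proxy unreachable: check the address, port, and firewall rules.'],
  ['ERR_TUNNEL_CONNECTION_FAILED', 'Proxy refused the tunnel: check credentials and whether the target is allowed.'],
  ['ERR_INVALID_AUTH_CREDENTIALS', 'Proxy rejected the username or password.'],
  ['ERR_TIMED_OUT', 'Connection timed out: the proxy may be slow or overloaded.'],
];

function diagnoseProxyError(message) {
  for (const [code, hint] of PROXY_ERROR_HINTS) {
    if (message.includes(code)) {
      return hint;
    }
  }
  return 'Not a recognised proxy error: retry without the proxy to compare.';
}

// Intended use around page.goto():
//   try { await page.goto(url); }
//   catch (err) { console.error(diagnoseProxyError(err.message)); }
console.log(diagnoseProxyError('net::ERR_PROXY_CONNECTION_FAILED at https://example.com'));
```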

Overcoming IP blocking

Overcoming IP blocking in Puppeteer involves implementing techniques such as rotating IP addresses, using residential or data center proxies, and managing request throttling to avoid detection and bypass restrictions imposed by websites. 

Here are some methods for overcoming IP blocking:

1. Rotate IP addresses

  • Some websites may block IP addresses frequently used for scraping or crawling. Consider using a proxy rotation service that provides a pool of IP addresses to avoid being blocked.
  • Puppeteer does not have built-in support for IP rotation. Still, you can manually switch proxies between requests or use a library like puppeteer-cluster to manage multiple instances of Puppeteer with different proxies.

2. Use residential or data center proxies

  • Residential proxies use IP addresses associated with real residential connections, while data center proxies use IP addresses hosted in data centers. Switching between these types of proxies can help bypass certain IP-blocking measures.
  • Residential proxies are generally more expensive but provide better chances of bypassing IP blocks since they appear as regular home connections.

3. Implement request throttling

  • Websites may flag your IP address if you send too many requests quickly. Implement request throttling or delays between requests to mimic human-like behavior and reduce the risk of IP blocking.
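
One simple way to throttle is to pause for a random interval between navigations. The sketch below assumes a delay range of one to three seconds, which is an arbitrary choice for illustration; sleep, randomDelayMs, and visitWithDelays are hypothetical helper names.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Pick a whole number of milliseconds in [minMs, maxMs].
function randomDelayMs(minMs, maxMs) {
  return minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
}

// Visit URLs one at a time, pausing a random interval between them.
async function visitWithDelays(page, urls, { minMs = 1000, maxMs = 3000 } = {}) {
  for (const [index, url] of urls.entries()) {
    if (index > 0) {
      await sleep(randomDelayMs(minMs, maxMs));
    }
    await page.goto(url);
  }
}
```

A randomized delay is used here instead of a fixed one because evenly spaced requests are themselves a recognizable pattern.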

4. Captcha solving

  • Some websites may challenge requests with captchas when they detect suspicious behavior. Consider using third-party services that specialize in solving captchas to automate this process.

Remember to always comply with the terms of service of the websites you are scraping and respect their rate limits.

Advanced Proxy Configuration

Advanced proxy configuration in Puppeteer gives you more granular control over how proxies are used. It covers authentication, proxy chaining, rotation, and managing several proxies at once, which together help bypass IP blocking and extend your automation capabilities.

Configuring Proxy Authentication

Chromium does not accept proxy credentials on the command line, so they cannot be passed through the args option of puppeteer.launch(). Instead, pass the proxy server address with --proxy-server and supply the username and password with the page.authenticate() method before navigating. Here’s an example of how to do it:

const puppeteer = require('puppeteer');

(async () => {
  const proxyServer = 'http://proxy.example.com:8080';
  const proxyUsername = 'your-username';
  const proxyPassword = 'your-password';

  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxyServer}`],
    ignoreHTTPSErrors: true,
  });

  const page = await browser.newPage();

  // Set the proxy credentials
  await page.authenticate({
    username: proxyUsername,
    password: proxyPassword,
  });

  // Rest of your code...

  await browser.close();
})();
  • In the code above, replace 'proxy.example.com:8080' with the actual address and port of your proxy server. Also, replace 'your-username' and 'your-password' with the appropriate credentials for proxy authentication.
  • By using the page.authenticate() method, you can set the proxy credentials for the specific page in Puppeteer. This ensures that the requests made by that page will go through the proxy server with the provided authentication.
  • Adjust the code according to your specific proxy server and credentials, and integrate it into your Puppeteer workflow as needed.
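
Proxy providers often hand out a single URL with the credentials embedded, such as http://user:pass@host:port. Since the credentials have to go to page.authenticate() and only the server part belongs in --proxy-server, a small helper can split the two. The sketch below uses Node’s built-in URL class; splitProxyUrl and the sample URL are assumptions for illustration.

```javascript
// Split "scheme://user:pass@host:port" into the pieces Puppeteer needs.
function splitProxyUrl(proxyUrl) {
  const parsed = new URL(proxyUrl);
  const server = `${parsed.protocol}//${parsed.host}`;
  const credentials = parsed.username
    ? {
        // URL keeps credentials percent-encoded, so decode them.
        username: decodeURIComponent(parsed.username),
        password: decodeURIComponent(parsed.password),
      }
    : null;
  return { server, credentials };
}

const { server, credentials } = splitProxyUrl('http://user1:p%40ss@proxy.example.com:8080');
console.log(server);      // http://proxy.example.com:8080
console.log(credentials); // { username: 'user1', password: 'p@ss' }
```

The server value goes into the --proxy-server argument, and credentials, when not null, can be passed as-is to page.authenticate().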

Using Proxy Chains

To use ProxyChains with Puppeteer, you need to configure ProxyChains to forward the network traffic from Puppeteer through your desired proxy server. 

1. Install ProxyChains

Install ProxyChains on your system by following the instructions for your operating system. ProxyChains is typically available via package managers like apt-get (for Debian-based systems) or brew (for macOS).

2. Configure ProxyChains

  • Open the ProxyChains configuration file (usually located at /etc/proxychains.conf).
  • Uncomment the dynamic_chain line and comment out the strict_chain line if necessary.

Under the [ProxyList] section, add your desired proxy server(s) in the following format:

proxy_type proxy_host proxy_port [proxy_username proxy_password]

3. Launch Puppeteer with ProxyChains

Start your Puppeteer script through the proxychains command, which intercepts the network traffic of the process it launches and forwards it through the configured proxies.

Here’s an example code snippet:

const { spawn } = require('child_process');

// Run the Puppeteer script with node under ProxyChains.
// On some systems the binary is named proxychains4.
const proxychains = spawn('proxychains', ['node', 'puppeteer-script.js']);

proxychains.stdout.on('data', (data) => {
  console.log(`ProxyChains stdout: ${data}`);
});

proxychains.stderr.on('data', (data) => {
  console.error(`ProxyChains stderr: ${data}`);
});

// Reported when the proxychains command cannot be started at all
proxychains.on('error', (err) => {
  console.error('Failed to start ProxyChains:', err);
});

proxychains.on('close', (code) => {
  console.log(`ProxyChains process exited with code ${code}`);
  // Perform any cleanup or additional actions
});

4. Write your Puppeteer script

  • Create a separate file (puppeteer-script.js in the example above) containing your Puppeteer code. This code will be executed with ProxyChains.
  • Ensure your Puppeteer script launches the browser and performs any desired actions, utilizing the Puppeteer API as usual.

5. Run the code

  • Run the launcher script above with node. It starts puppeteer-script.js through ProxyChains.
  • Ensure that ProxyChains is configured correctly and the desired proxy server(s) is added to the configuration file.

Following these steps, you can leverage ProxyChains to route your Puppeteer traffic through a proxy server. ProxyChains intercepts the network traffic and forwards it through the configured proxy, allowing you to anonymize and control the requests made by Puppeteer.

Rotating Proxies

Rotating proxies in Puppeteer refers to using a pool of different proxies and switching between them for each request or periodically. This technique helps overcome IP blocking, bypass rate limits, and avoid website detection.

Here’s an example of rotating proxies in Puppeteer code:

const puppeteer = require('puppeteer');

const proxies = ['proxy1.example.com:8080', 'proxy2.example.com:8080', 'proxy3.example.com:8080'];
let currentProxyIndex = 0;

async function makeRequestWithProxy(url) {
  // Select the current proxy from the pool
  const proxyUrl = proxies[currentProxyIndex];

  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxyUrl}`],
  });

  try {
    const page = await browser.newPage();

    // Perform actions on the page
    await page.goto(url);

    // Handle any further actions or scraping logic here
  } finally {
    // Close the browser even if navigation failed
    await browser.close();

    // Rotate to the next proxy
    currentProxyIndex = (currentProxyIndex + 1) % proxies.length;
  }
}

// Example usage
makeRequestWithProxy('https://www.browserstack.com/')
  .then(() => {
    console.log('Request completed successfully');
  })
  .catch((error) => {
    console.error('Error occurred while making the request:', error);
  });
  • The code above has an array ‘proxies’ containing multiple proxy server addresses. The ‘currentProxyIndex’ variable keeps track of the current proxy being used.
  • The ‘makeRequestWithProxy’ function takes a URL as input and performs a request using the current proxy from the pool. It launches a Puppeteer instance with the selected proxy, navigates to the provided URL, and performs any necessary actions or scraping operations.
  • After completing the request, the function closes the browser instance and rotates to the next proxy in the pool using the ‘currentProxyIndex’ variable.

This approach allows you to distribute your requests across different proxies, avoiding IP blocking and rate limits that websites may impose. By rotating proxies, you can improve the success rate of your requests and enhance the reliability of your Puppeteer-based automation or scraping tasks.
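
A plain round-robin index keeps handing out proxies that have stopped working. One possible refinement, sketched below, is a small pool that counts failures and skips a proxy once it reaches a limit. The ProxyPool class, its method names, and the failure threshold are assumptions for this example and are not part of Puppeteer.

```javascript
class ProxyPool {
  constructor(proxies, maxFailures = 3) {
    if (proxies.length === 0) {
      throw new Error('The proxy pool is empty');
    }
    this.proxies = [...proxies];
    this.maxFailures = maxFailures;
    this.failures = new Map();
    this.index = 0;
  }

  // Return the next healthy proxy in round-robin order.
  next() {
    for (let i = 0; i < this.proxies.length; i += 1) {
      const proxy = this.proxies[this.index];
      this.index = (this.index + 1) % this.proxies.length;
      if ((this.failures.get(proxy) || 0) < this.maxFailures) {
        return proxy;
      }
    }
    throw new Error('No healthy proxies left in the pool');
  }

  // Record a failed request for this proxy.
  reportFailure(proxy) {
    this.failures.set(proxy, (this.failures.get(proxy) || 0) + 1);
  }

  // Clear the failure count after a successful request.
  reportSuccess(proxy) {
    this.failures.delete(proxy);
  }
}

const pool = new ProxyPool(['proxy1.example.com:8080', 'proxy2.example.com:8080'], 1);
console.log(pool.next()); // proxy1.example.com:8080
pool.reportFailure('proxy2.example.com:8080');
console.log(pool.next()); // proxy1.example.com:8080 (proxy2 is skipped)
```

In a scraping loop, pool.next() would supply the value for --proxy-server, with reportFailure() called from the catch block and reportSuccess() after a successful navigation.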

Managing Multiple Proxies 

Managing multiple proxies in Puppeteer involves creating multiple instances of Puppeteer, each with its proxy configuration. This approach allows you to distribute requests across different proxies, balancing the load and improving performance. Here’s an example of managing multiple proxies in Puppeteer code:

const puppeteer = require('puppeteer');

const proxies = [
  { proxyUrl: 'http://proxy1.example.com:8080', username: 'user1', password: 'pass1' },
  { proxyUrl: 'http://proxy2.example.com:8080', username: 'user2', password: 'pass2' },
  { proxyUrl: 'http://proxy3.example.com:8080', username: 'user3', password: 'pass3' },
];

async function makeRequestWithProxy(url, proxy) {
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxy.proxyUrl}`],
  });

  try {
    const page = await browser.newPage();

    // Set the proxy authentication credentials if required
    if (proxy.username && proxy.password) {
      await page.authenticate({ username: proxy.username, password: proxy.password });
    }

    // Perform actions on the page
    await page.goto(url);

    // Handle any further actions or scraping logic here
  } finally {
    // Close the browser even if the request failed
    await browser.close();
  }
}

// Example usage
const url = 'https://www.browserstack.com/';
proxies.forEach(async (proxy) => {
  try {
    await makeRequestWithProxy(url, proxy);
    console.log(`Request completed successfully using proxy: ${proxy.proxyUrl}`);
  } catch (error) {
    console.error(`Error occurred while making the request using proxy: ${proxy.proxyUrl}`, error);
  }
});
  • In the code above, we have an array proxies containing multiple proxy configurations. Each proxy configuration includes the proxy URL and optional authentication credentials (username and password).
  • The makeRequestWithProxy function takes a URL and a proxy configuration as inputs. It launches a Puppeteer instance with the specified proxy settings, navigates to the provided URL, and performs any necessary actions or scraping operations.
  • If the proxy configuration includes authentication credentials, they are set using the page.authenticate() method before making the request.
  • In the example usage section, we iterate through the proxies array and make requests using each proxy configuration. 
  • Because forEach does not wait for async callbacks, all the requests start at once and run concurrently, one browser per proxy. To run them one after another instead, use a for...of loop with await.

This approach enables you to manage and distribute requests across multiple proxies in Puppeteer, allowing for load balancing and improved performance. Each request is made with a different proxy configuration, ensuring the requests are distributed among the available proxies.
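
Launching a full browser per proxy is simple but heavy. A lighter alternative is to keep one browser and give each proxy its own browser context. The sketch below assumes Puppeteer v22 or later, where browser.createBrowserContext() accepts a proxyServer option (older releases name the method createIncognitoBrowserContext()); openPagePerProxy is a hypothetical helper name.

```javascript
// Open one page per proxy inside a single browser, using one browser
// context per proxy. `browser` is an already launched Puppeteer Browser.
async function openPagePerProxy(browser, proxyUrls) {
  const pages = [];
  for (const proxyServer of proxyUrls) {
    // Assumption: Puppeteer v22+. Older releases call this method
    // createIncognitoBrowserContext().
    const context = await browser.createBrowserContext({ proxyServer });
    const page = await context.newPage();
    pages.push({ proxyServer, context, page });
  }
  return pages;
}
```

Each returned entry keeps its context so that it can be closed with context.close() when the work for that proxy is done. Proxies that need credentials still require page.authenticate() on each page.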

Debugging in Puppeteer with Proxy

When debugging issues in Puppeteer with proxy configurations, there are several techniques you can use to troubleshoot and identify potential problems. 

  • Enable verbose logging: Launch Puppeteer with the dumpio: true option, or set the DEBUG=puppeteer:* environment variable. This enables detailed logging in the console, providing insights into the underlying operations and potential errors related to the proxy configuration.
  • Inspect network requests using DevTools: Launch Puppeteer with headless: false and devtools: true so that Chrome DevTools opens with each page. This allows you to monitor network requests made by Puppeteer and verify that they are being routed through the proxy. Look for any error messages or warnings related to the proxy in the Network tab of the DevTools.
  • Check for error responses or status codes: Inspect the HTTP responses received when making requests with the proxy. Look for error status codes (e.g., 4xx or 5xx) that might indicate issues with the proxy server or authentication.
  • Test the proxy separately: Use tools like cURL or browser extensions to test the proxy server independently from Puppeteer. This helps isolate and identify if the issue lies with the proxy server or the Puppeteer code.
  • Validate proxy server settings: Double-check the proxy server configuration, including the proxy URL, port, and authentication credentials (if applicable). Ensure they are correctly provided in the Puppeteer code.
  • Test with a different proxy: Try using a different proxy server to see if the issue persists. This helps determine if the problem is specific to the proxy or a more general configuration issue.
  • Review error messages and exceptions: Monitor the console output and catch any exceptions or error messages related to the proxy configuration. These can provide insights into the specific error that occurred.

By following these steps, you can effectively debug issues related to proxy configurations in Puppeteer. Remember to analyze the console output, inspect network requests, validate proxy settings, and test with alternative proxies to narrow down the potential causes of the problem.
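
Several of the checks above can be automated with page event listeners. The following sketch logs failed requests and responses with status 407 (Proxy Authentication Required); attachProxyDebugLogging and its log parameter are hypothetical names introduced for this example.

```javascript
// Attach listeners that report failed requests and proxy authentication
// errors. `log` defaults to console.error and can be swapped for testing.
function attachProxyDebugLogging(page, log = console.error) {
  page.on('requestfailed', (request) => {
    const failure = request.failure();
    log(`FAILED ${request.url()} ${failure ? failure.errorText : 'unknown error'}`);
  });

  page.on('response', (response) => {
    // 407 means the proxy itself is asking for credentials.
    if (response.status() === 407) {
      log(`PROXY AUTH REQUIRED ${response.url()}`);
    }
  });
}
```

Call attachProxyDebugLogging(page) right after browser.newPage() so that failures during the first navigation are captured as well.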

Conclusion 

By mastering these concepts and techniques, you can effectively utilize proxies in Puppeteer, enhance your web automation and scraping capabilities, and overcome various limitations and restrictions imposed by websites or services. 

Testing websites on private networks is tough, and it requires setting up a Proxy Server. However, BrowserStack allows the testing of private websites with its Local Testing feature. Local Testing is also possible when you’re behind a proxy, firewall or VPN.


FAQs

What proxies are best for Puppeteer?

Some popular proxy providers are often used with Puppeteer. Researching and evaluating these providers based on your specific needs is important. Here are a few well-known proxy providers:

  • Bright Data (formerly Luminati): Bright Data is a widely used proxy service that offers a large pool of residential IP addresses. They provide HTTP, HTTPS, and SOCKS proxies and offer IP rotation and session control features.
  • Oxylabs: Oxylabs is another reputable proxy provider offering residential and data center proxies. They have a large proxy network and support various proxy types and configurations.
  • Smartproxy: Smartproxy is known for its rotating residential proxies allowing you to request different IP addresses. They offer a user-friendly dashboard and provide proxy solutions for various use cases.
  • ProxyRack: ProxyRack offers residential and data center proxies supporting HTTP, HTTPS, and SOCKS protocols. They provide global proxy coverage and have flexible pricing plans.
  • GeoSurf: GeoSurf specializes in providing geographically targeted proxies, allowing you to access websites as if browsing from specific locations worldwide. They offer both residential and data center proxies.

Remember to consider factors such as pricing, features, reliability, performance, and support when selecting a proxy provider for Puppeteer. It’s also recommended to check reviews, compare providers, and assess their suitability for your use case.

What are 3 examples of proxies?

There are several types of proxies available, each serving different purposes. Here are three common examples of proxies:

  • HTTP Proxy: An HTTP proxy is an intermediary for HTTP (Hypertext Transfer Protocol) requests between a client (a web browser) and a server. It can handle HTTP requests, including GET, POST, and HEAD, and can be used for web browsing, scraping, or accessing web services. HTTP proxies are often used for general web traffic and can provide caching and content-filtering features.
  • HTTPS Proxy: An HTTPS proxy handles secure HTTPS (Hypertext Transfer Protocol Secure) traffic. It typically relays HTTPS requests by tunneling them with the HTTP CONNECT method, so the encrypted session stays end to end between the client and the server. The term is also used for proxies that accept a TLS-encrypted connection from the client. HTTPS proxies are commonly used when sensitive data must be transmitted securely.
  • SOCKS Proxy: SOCKS (Socket Secure) proxy is a protocol that operates at a lower level than HTTP and HTTPS proxies. SOCKS proxies can handle various types of network traffic, including TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). They are often used for tasks like torrenting, gaming, or accessing applications that require a flexible and versatile proxy solution.

These are just a few examples of proxies commonly used in various contexts. Choosing the appropriate proxy type is important based on your specific requirements and the protocols involved in your network communication.
