RankVectors::CreateProjectRequest

Properties

| Name | Type | Description | Notes |
| ---- | ---- | ----------- | ----- |
| name | String | Project name | |
| domain | String | Website domain URL | |
| prompt | String | Natural language prompt for crawling | [optional] |
| search_query | String | Search query for targeted crawling | [optional] |
| sitemap_mode | String | How to handle sitemaps | [optional][default to 'include'] |
| include_subdomains | Boolean | Whether to include subdomains | [optional][default to true] |
| ignore_query_params | Boolean | Whether to ignore URL query parameters | [optional][default to true] |
| max_discovery_depth | Integer | Maximum crawl depth | [optional] |
| exclude_paths | Array<String> | Paths to exclude from crawling | [optional] |
| include_paths | Array<String> | Specific paths to include | [optional] |
| crawl_entire_domain | Boolean | Whether to crawl the entire domain | [optional][default to false] |
| allow_external_links | Boolean | Whether to allow external links | [optional][default to false] |
| max_pages | Integer | Maximum number of pages to crawl | [optional][default to 100] |
| crawl_delay | Integer | Delay between crawl requests (ms) | [optional] |
| crawl_max_concurrency | Integer | Maximum concurrent crawl requests | [optional] |
| only_main_content | Boolean | Whether to extract only main content | [optional][default to true] |
| custom_headers | Hash<String, String> | Custom headers for crawling | [optional] |
| wait_for | Integer | Wait time for page load (ms) | [optional][default to 0] |
| block_ads | Boolean | Whether to block ads | [optional][default to true] |
| proxy_mode | String | Proxy mode for crawling | [optional][default to 'auto'] |
| use_reranking | Boolean | Whether to use AI reranking | [optional][default to true] |
| enable_change_tracking | Boolean | Whether to enable change tracking | [optional][default to false] |
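
The Notes column above lists a default for several optional properties. As a rough sketch that does not depend on the gem, those documented defaults can be collected in a plain hash and merged with caller-supplied values. `DOCUMENTED_DEFAULTS` and `build_project_attributes` are hypothetical names used only for illustration; they are not part of RankVectors, and whether the defaults are applied by the client or by the server is not stated here.

```ruby
# Hypothetical helper, not part of the rankvectors gem.
# Default values are copied from the Notes column of the Properties list.
DOCUMENTED_DEFAULTS = {
  sitemap_mode: 'include',
  include_subdomains: true,
  ignore_query_params: true,
  crawl_entire_domain: false,
  allow_external_links: false,
  max_pages: 100,
  only_main_content: true,
  wait_for: 0,
  block_ads: true,
  proxy_mode: 'auto',
  use_reranking: true,
  enable_change_tracking: false
}.freeze

# Merge caller overrides on top of the documented defaults.
# name and domain carry no [optional] note, so they are required keywords here.
def build_project_attributes(name:, domain:, **overrides)
  DOCUMENTED_DEFAULTS.merge(overrides).merge(name: name, domain: domain)
end

attrs = build_project_attributes(
  name: 'My Website',
  domain: 'https://example.com',
  max_pages: 250
)
```

The resulting hash uses the same keys as the properties listed above, so it could presumably be splatted into the constructor as `RankVectors::CreateProjectRequest.new(**attrs)`.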

Example

require 'rankvectors'

instance = RankVectors::CreateProjectRequest.new(
  name: 'My Website',
  domain: 'https://example.com',
  prompt: 'Only crawl blog posts and documentation',
  search_query: 'SEO optimization',
  sitemap_mode: 'include',
  include_subdomains: true,
  ignore_query_params: true,
  max_discovery_depth: 3,
  exclude_paths: ['/admin', '/private'],
  include_paths: ['/blog', '/docs'],
  crawl_entire_domain: false,
  allow_external_links: false,
  max_pages: 100,
  crawl_delay: 1000,
  crawl_max_concurrency: 5,
  only_main_content: true,
  custom_headers: nil,
  wait_for: 0,
  block_ads: true,
  proxy_mode: 'auto',
  use_reranking: true,
  enable_change_tracking: false
)
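
Since `name` and `domain` are the only properties without an `[optional]` note, it can be useful to check for them before building a request. The sketch below is a standalone, hypothetical pre-check (`REQUIRED_FIELDS` and `missing_required_fields` are illustrative names, not part of the gem); the generated model may well perform its own validation.

```ruby
# Hypothetical pre-check, not part of the rankvectors gem.
# name and domain are the only properties not marked [optional].
REQUIRED_FIELDS = %i[name domain].freeze

# Return the required fields that are absent or blank in the given attributes hash.
def missing_required_fields(attributes)
  REQUIRED_FIELDS.select do |field|
    value = attributes[field]
    value.nil? || value.to_s.strip.empty?
  end
end

missing = missing_required_fields({ name: 'My Website', prompt: 'Only crawl blog posts' })
warn "Missing required fields: #{missing.join(', ')}" unless missing.empty?
```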