Skip to content

Latest commit

 

History

History
70 lines (64 loc) · 2.99 KB

File metadata and controls

70 lines (64 loc) · 2.99 KB

RankVectors::Project

Properties

Name Type Description Notes
id String Unique project identifier
name String Project name
domain String Website domain URL
user_id String User who owns the project
prompt String Natural language prompt for crawling [optional]
search_query String Search query for targeted crawling [optional]
sitemap_mode String How to handle sitemaps [optional]
include_subdomains Boolean Whether to include subdomains [optional]
ignore_query_params Boolean Whether to ignore URL query parameters [optional]
max_discovery_depth Integer Maximum crawl depth [optional]
exclude_paths Array<String> Paths to exclude from crawling [optional]
include_paths Array<String> Specific paths to include [optional]
crawl_entire_domain Boolean Whether to crawl the entire domain [optional]
allow_external_links Boolean Whether to allow external links [optional]
max_pages Integer Maximum number of pages to crawl [optional]
crawl_delay Integer Delay between crawl requests (ms) [optional]
crawl_max_concurrency Integer Maximum concurrent crawl requests [optional]
only_main_content Boolean Whether to extract only main content [optional]
custom_headers Hash<String, String> Custom headers for crawling [optional]
wait_for Integer Wait time for page load (ms) [optional]
block_ads Boolean Whether to block ads [optional]
proxy_mode String Proxy mode for crawling [optional]
use_reranking Boolean Whether to use AI reranking [optional]
enable_change_tracking Boolean Whether to enable change tracking [optional]
created_at Time Project creation timestamp
updated_at Time Last update timestamp
_count ProjectCount [optional]

Example

require 'rankvectors'

instance = RankVectors::Project.new(
  id: proj-123,
  name: My Website,
  domain: https://example.com,
  user_id: user-456,
  prompt: Only crawl blog posts and documentation,
  search_query: SEO optimization,
  sitemap_mode: include,
  include_subdomains: true,
  ignore_query_params: true,
  max_discovery_depth: 3,
  exclude_paths: [&quot;/admin&quot;,&quot;/private&quot;],
  include_paths: [&quot;/blog&quot;,&quot;/docs&quot;],
  crawl_entire_domain: false,
  allow_external_links: false,
  max_pages: 100,
  crawl_delay: 1000,
  crawl_max_concurrency: 5,
  only_main_content: true,
  custom_headers: null,
  wait_for: 0,
  block_ads: true,
  proxy_mode: auto,
  use_reranking: true,
  enable_change_tracking: false,
  created_at: 2025-01-15T10:00Z,
  updated_at: 2025-01-15T10:00Z,
  _count: null
)