Title: | A Simple HTTP Database Interface to 'ClickHouse' |
---|---|
Description: | 'ClickHouse' (<https://clickhouse.com/>) is an open-source, high performance columnar OLAP (online analytical processing of queries) database management system for real-time analytics using SQL. This 'DBI' backend relies on the 'ClickHouse' HTTP interface and support HTTPS protocol. |
Authors: | Patrice Godard [aut, cre, cph], Eusebiu Marcu [ctb] |
Maintainer: | Patrice Godard <[email protected]> |
License: | GPL-3 |
Version: | 0.3.3 |
Built: | 2024-11-03 03:50:08 UTC |
Source: | https://github.com/patzaw/clickhousehttp |
Create a ClickHouseHTTP DBI driver
ClickHouseHTTP()
ClickHouseHTTP()
A ClickHouseHTTPDriver
ClickHouseHTTPConnection class.
Send SQL query to ClickHouse
Information about the ClickHouse database
Create a table in ClickHouse
Write a table in ClickHouse
## S4 method for signature 'ClickHouseHTTPConnection,character' dbSendQuery( conn, statement, format = c("Arrow", "TabSeparatedWithNamesAndTypes"), file = NA, ... ) ## S4 method for signature 'ClickHouseHTTPConnection' dbGetInfo(dbObj, ...) ## S4 method for signature 'ClickHouseHTTPConnection' dbCreateTable( conn, name, fields, engine = "TinyLog", overwrite = FALSE, ..., row.names = NULL, temporary = FALSE ) ## S4 method for signature 'ClickHouseHTTPConnection,ANY' dbWriteTable( conn, name, value, overwrite = FALSE, append = FALSE, engine = "TinyLog", ... )
## S4 method for signature 'ClickHouseHTTPConnection,character' dbSendQuery( conn, statement, format = c("Arrow", "TabSeparatedWithNamesAndTypes"), file = NA, ... ) ## S4 method for signature 'ClickHouseHTTPConnection' dbGetInfo(dbObj, ...) ## S4 method for signature 'ClickHouseHTTPConnection' dbCreateTable( conn, name, fields, engine = "TinyLog", overwrite = FALSE, ..., row.names = NULL, temporary = FALSE ) ## S4 method for signature 'ClickHouseHTTPConnection,ANY' dbWriteTable( conn, name, value, overwrite = FALSE, append = FALSE, engine = "TinyLog", ... )
conn |
a ClickHouseHTTPConnection object created with |
statement |
the SQL query statement |
format |
the format used by ClickHouse to send the results. Two formats are supported: "Arrow" (default) and "TabSeparatedWithNamesAndTypes" |
file |
a path to a file to send along the query (default: NA) |
... |
Other parameters passed on to methods |
dbObj |
a ClickHouseHTTPConnection object |
name |
the name of the table to create |
fields |
a character vector with the name of the fields and their
ClickHouse type
(e.g.
|
engine |
the ClickHouse table engine as described in ClickHouse documentation. Examples:
|
overwrite |
if TRUE and if a table with the same name exists, then it is deleted before creating the new one (default: FALSE) |
row.names |
unsupported parameter (add for compatibility reason) |
temporary |
unsupported parameter (add for compatibility reason) |
value |
the table to write |
append |
if TRUE, the values are added to the database table if it exists (default: FALSE). |
Both format have their pros and cons:
Arrow (default):
fast for long tables but slow for wide tables
fast with Array columns
Date and DateTime columns are returned as UInt16 and UInt32 respectively: by default, ClickHouseHTTP interpret them as Date and POSIXct columns but cannot make the difference with actual UInt16 and UInt32
TabSeparatedWithNamesAndTypes:
in general faster than Arrow
fast for wide tables but slow for long tables
slow with Array columns
Special characters are not well interpreted. In such cases, the function below can be useful but can also take time.
.sp_ch_recov <- function(x){ stringi::stri_replace_all_regex( x, c( "\\n", "\\t", "\\r", "\\b", "\\a", "\\f", "\\'", "\\\\" ), c("\n", "\t", "\r", "\b", "\a", "\f", "'", "\\"), vectorize_all=FALSE ) }
A ClickHouseHTTPResult object
A list with the following elements:
name: "ClickHouseHTTPConnection"
db.version: the version of ClickHouse
uptime: ClickHouse uptime
dbname: the default database
username: user name
host: ClickHouse host
port: ClickHouse port
https: Is the connection using HTTPS protocol instead of HTTP
dbCreateTable() returns TRUE, invisibly.
TRUE; called for side effects
## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run) ## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run) ## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run)
## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run) ## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run) ## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run)
Driver for the ClickHouse database using HTTP(S) interface
Connect to a ClickHouse database using the ClickHouseHTTP DBI
## S4 method for signature 'ClickHouseHTTPDriver' dbConnect( drv, host = "localhost", port = 8123L, dbname = "default", user = "default", password = "", https = FALSE, ssl_verifypeer = TRUE, host_path = NA, session_timeout = 3600L, convert_uint = TRUE, extended_headers = list(), reset_handle = FALSE, ... )
## S4 method for signature 'ClickHouseHTTPDriver' dbConnect( drv, host = "localhost", port = 8123L, dbname = "default", user = "default", password = "", https = FALSE, ssl_verifypeer = TRUE, host_path = NA, session_timeout = 3600L, convert_uint = TRUE, extended_headers = list(), reset_handle = FALSE, ... )
drv |
A driver object created by |
host |
name of the database host (default: "localhost") |
port |
port on which the database is listening (default: 8123L) |
dbname |
name of the default database (default: "default") |
user |
user name (default: "default") |
password |
user password (default: "") |
https |
a logical to use the HTTPS protocol (default: FALSE) |
ssl_verifypeer |
a logical to verify SSL certificate when using HTTPS (default: TRUE) |
host_path |
a path to use on host (e.g. "ClickHouse/"): it allows to connect on a server behind a reverse proxy for example |
session_timeout |
timeout in seconds (default: 3600L seconds) |
convert_uint |
a logical: if TRUE (default), UInt ClickHouse data types are converted in the following R classes:
|
extended_headers |
a named list with other HTTP headers
(for example: |
reset_handle |
a logical indicating how to manage Curl handles (see httr::handle_pool). If TRUE, handle reset is used (default: FALSE). |
... |
Other parameters passed on to methods |
A ClickHouseHTTPConnection
## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run)
## Not run: ## Connection ---- library(DBI) ### HTTP connection ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8123 ) ### HTTPS connection (without ssl peer verification) ---- con <- dbConnect( ClickHouseHTTP::ClickHouseHTTP(), host="localhost", port=8443, https=TRUE, ssl_verifypeer=FALSE ) ## Write a table in the database ---- library(dplyr) data("mtcars") mtcars <- as_tibble(mtcars, rownames="car") dbWriteTable(con, "mtcars", mtcars) ## Query the database ---- carsFromDB <- dbReadTable(con, "mtcars") dbGetQuery(con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110") ## By default, ClickHouseHTTP relies on the ## Apache Arrow format provided by ClickHouse. ## The `format` argument of the `dbGetQuery()` function can be used to ## rely on the *TabSeparatedWithNamesAndTypes* format. selCars <- dbGetQuery( con, "SELECT car, mpg, cyl, hp FROM mtcars WHERE hp>=110", format="TabSeparatedWithNamesAndTypes" ) ## Identifying the original ClickHouse data types attr(selCars, "type") ## Using alternative databases stored in ClickHouse ---- dbSendQuery(con, "CREATE DATABASE swiss") dbSendQuery(con, "USE swiss") ## The chosen database is used until the session expires. ## It can also be chosen when connecting using the `dbname` argument of ## the `dbConnect()` function. ## The example below shows that spaces in column names are supported. ## It also shows the support of R `list` using the *Array* ClickHouse type. data("swiss") swiss <- as_tibble(swiss, rownames="province") swiss <- mutate(swiss, "pr letters"=strsplit(province, "")) dbWriteTable( con, "swiss", swiss, engine="MergeTree() ORDER BY (Fertility, province)" ) swissFromDB <- dbReadTable(con, "swiss") ## A table from another database can also be accessed as following: dbReadTable(con, SQL("default.mtcars")) ## End(Not run)