Python连接ClickHouse的TCP与HTTP端口方法详解及常见报错解决方案
一、使用request库
使用 HTTP协议
端口,默认为 8123
这种方法只能获取指定格式的数据
import requests
SSL_VERIFY = False
host = 'http://127.0.0.1:8123' //ip地址及HTTP协议端口
query = 'select * from database.table_names limit 5' //SQL语句
user = ('username','password') //填入账号密码
r = requests.post(host,data = query,auth = user)
print(r.text)
//二进制数据 r.content
//json格式 r.json()
二、使用clickhouse_driver库
1.使用client
使用 TCP协议
端口,默认为 9000
from clickhouse_driver import Client
host = '127.0.0.1' //ip地址
port = 9000 //TCP端口
username = 'username' //用户名
password = 'password' //密码
client = Client(host=host, port=port , database=database,user=user ,password=pw)
query = 'select * from database.table_names limit 5'
result = client.execute(query)
for row in result:
print(row)
2.使用connect
使用 TCP协议
端口,默认为 9000
from clickhouse_driver import connect
host = '127.0.0.1' //ip地址
port = 9000 //TCP端口
username = 'username' //用户名
password = 'password' //密码
conn = connect(host = host, port = port, user = username, password = password)
cursor = conn.cursor()
query = 'select * from database.table_names limit 5'
result = cursor.execute(query)
for row in result:
print(row)
conn.close()
三、使用clickhouse_sqlalchemy库
使用 HTTP协议
端口,默认为 8123
from sqlalchemy import create_engine
from sqlalchemy import text
user = 'username'
password = 'password'
host = '127.0.0.1'
port = '8123'
db = 'default'
connection = f'clickhouse://{user}:{password}@{host}:{port}/{db}'
engine = create_engine(connection, pool_size=100, pool_recycle=3600, pool_timeout=20)
query = text('select * from database.table_names limit 5')
with engine.connect() as connection:
result = connection.execute(query)
for row in result:
print(row)
多条SQL语句执行并打印有输出的语句:
from sqlalchemy import create_engine
from sqlalchemy import text
user = 'username'
password = 'password'
host = '127.0.0.1'
port = '8123'
db = 'default'
connection = f'clickhouse://{user}:{password}@{host}:{port}/{db}'
engine = create_engine(connection, pool_size=100, pool_recycle=3600, pool_timeout=20)
query = ['show databases','use database_name','show tables','select * from database.table_names limit 5']
with engine.connect() as connection:
for sql in query:
result = connection.execute(text(sql))
print(sql)
if sql.startswith('show') or sql.startswith('select'):
rows = result.fetchall()
for row in rows:
print(row)
四、踩坑记录:
1.在使用clickhouse_driver连接的时候报错
clickhouse_driver.dbapi.errors.OperationalError: Code: 102. Unexpected packet from server 127.0.0.1:9000 (expected Hello or Exception, got Unknown packet)
错误原因: 可能是clickhouse的TCP端口不是默认的 9000 。
解决:
(1)找到clickhouse的配置文件config.xml,通常在 /etc/clickhouse-server 下面:
vim /etc/clickhouse-server/configh.xml
(2)找到 <http_port>
和 <tcp_port>
两个标签
<http_port>8123</http_port>
<tcp_port>9118</tcp_port>
其中8123为http端口,9118为tcp端口
2.在使用clickhouse_sqlalchemy连接时报错
clickhouse_sqlalchemy.exceptions.DatabaseException: Orig exception: Code: 516. DB::Exception: default: Authentication failed: password is incorrect, or there is no user with such name.
错误原因:账号不存在或密码错误。
我当时看到这个报错懵懵的,和同事确认了好几次账号密码,后来看了文档
Clickhouse-SQLAlchemy – 数据库相关的记录文档 (hellowac.github.io)
才突然意识到,我的密码是 Aa@123456,密码中有'@',被误认为是连接字符串的分隔符了
,所以才报了密码错误的错。也许user中存在 ‘:’ 也会引起相关问题。
解决:
使用URL编码转换成安全的形式:
encoded_password = password.replace('@', '%40')
完整代码为:
from sqlalchemy import create_engine
from sqlalchemy import text
user = 'username'
password = 'password'
host = '127.0.0.1'
port = '8123'
db = 'default'
encoded_password = password.replace('@', '%40')
connection = f'clickhouse://{user}:{encoded_password}@{host}:{port}/{db}'
engine = create_engine(connection, pool_size=100, pool_recycle=3600, pool_timeout=20)
query = text('select * from database.table_names limit 5')
with engine.connect() as connection:
result = connection.execute(query)
for row in result:
print(row)
参考文档:
https://blog.csdn.net/qq_43965708/article/details/123657894
https://hellowac.github.io/database_doc/clickhouse/clickhouse-sqlalchemy.html
https://blog.csdn.net/vonhehe/article/details/95756567
https://blog.csdn.net/yudiandian2014/article/details/139681112
作者:有好的生发方法记得推荐给我