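A job-listing crawler built on Python and the lightweight Flask web framework, using multithreading and Selenium to scrape job postings from a recruitment site.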
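Contents
1. Basic environment setup
2. Kernel configuration
2. Building a three-master, two-worker k8s cluster
3. Deploying a Keepalived and HAProxy high-availability cluster
4. Setting up MySQL with master-slave replication and read/write splitting
5. Deploying the Flask application to the worker nodes
6. Setting up NFS shared storage and creating the PV and PVC
7. Installing an intranet tunneling tool and configuring port forwarding
8. Final result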
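1. Basic environment setup
My own setup: five servers, each with at least 2 CPU cores and 4 GB of RAM, running CentOS 7.9.
If you have fewer resources, one master and two worker nodes are enough.
1.1 Configure the hostname and hosts file on all nodes
hostnamectl set-hostname k8s-master01   # use each node's own name (k8s-master02, k8s-node01, ...)
Apply it immediately:
systemctl restart systemd-hostnamed
exec bash
vim /etc/hosts (remember to change the IPs to your own):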
192.168.163.151 k8s-master01
192.168.163.152 k8s-master02
192.168.163.153 k8s-master03
192.168.163.154 k8s-node01
192.168.163.155 k8s-node02
1.2 Configure the Docker and Kubernetes repos and the default yum repo on all nodes
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
1.3 Install some common tools on all nodes
yum install wget jq psmisc vim net-tools telnet git -y
1.4 Disable the firewall, SELinux, DNSmasq and NetworkManager on all nodes
systemctl disable --now firewalld
systemctl disable --now dnsmasq
systemctl disable --now NetworkManager
setenforce 0
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/sysconfig/selinux
sed -i 's#SELINUX=enforcing#SELINUX=disabled#g' /etc/selinux/config
1.5 Disable swap on all nodes
swapoff -a && sysctl -w vm.swappiness=0
sed -ri '/^[^#]*swap/s@^@#@' /etc/fstab
1.6 Install ntpdate on all nodes
rpm -ivh http://mirrors.wlnmp.com/centos/wlnmp-release-centos.noarch.rpm
yum install ntpdate -y
Synchronize the time:
ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
echo 'Asia/Shanghai' >/etc/timezone
ntpdate time2.aliyun.com
crontab -e
Then add this line: */5 * * * * /usr/sbin/ntpdate time2.aliyun.com
1.7 Configure limits on all nodes
ulimit -SHn 65535
vim /etc/security/limits.conf
# append the following at the end of the file
* soft nofile 65536
* hard nofile 131072
* soft nproc 65535
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
1.8 Set up passwordless SSH from Master01 to the other nodes
ssh-keygen -t rsa
for i in k8s-master01 k8s-master02 k8s-master03 k8s-node01 k8s-node02; do ssh-copy-id -i .ssh/id_rsa.pub $i; done
1.9 Download all the installation source files on Master01
cd /root/ ; git clone https://gitee.com/dukuan/k8s-ha-install.git
1.10 Upgrade the system and reboot on all nodes
yum update -y && reboot
2. Kernel configuration
For cluster stability and compatibility, a production kernel should be upgraded to 4.18 or later; this example upgrades to 4.19.
2.1 Download the offline packages on Master01
cd /root
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm
wget http://193.49.22.109/elrepo/kernel/el7/x86_64/RPMS/kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm
2.2 Copy the packages from Master01 to the other nodes
for i in k8s-master02 k8s-master03 k8s-node01 k8s-node02; do scp kernel-ml-4.19.12-1.el7.elrepo.x86_64.rpm kernel-ml-devel-4.19.12-1.el7.elrepo.x86_64.rpm $i:/root/ ; done
2.3 Install the kernel on all nodes
cd /root && yum localinstall -y kernel-ml*
2.4 Change the kernel boot order on all nodes
grub2-set-default 0 && grub2-mkconfig -o /etc/grub2.cfg
grubby --args="user_namespace.enable=1" --update-kernel="$(grubby --default-kernel)"
2.5 Check on all nodes that the default kernel is 4.19
grubby --default-kernel
/boot/vmlinuz-4.19.12-1.el7.elrepo.x86_64
2.6 Reboot all nodes, then check that the running kernel is 4.19
reboot
uname -a
2.7 Install ipvsadm and ipset on all nodes
yum install ipvsadm ipset sysstat conntrack libseccomp -y
2.8 Configure the ipvs modules on all nodes. From kernel 4.19 onward nf_conntrack_ipv4 has been renamed nf_conntrack; on 4.18 and earlier keep using nf_conntrack_ipv4.
vim /etc/modules-load.d/ipvs.conf
# add the following content
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
ip_vs_ftp
# on kernels 4.18 and earlier, use nf_conntrack_ipv4 instead
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
Then run:
systemctl enable --now systemd-modules-load.service
2.9 Enable the kernel parameters a K8s cluster requires; configure them on all nodes
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
fs.may_detach_mounts = 1
net.ipv4.conf.all.route_localnet = 1
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.ip_conntrack_max = 65536
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system
2.10 After configuring the kernel on all nodes, reboot and confirm that the modules are still loaded after the restart
reboot
lsmod | grep --color=auto -e ip_vs -e nf_conntrack
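The lsmod check above can also be scripted, which is handy when repeating it on every node. The following is a minimal sketch and not part of the original setup: `missing_modules`, `read_loaded_modules` and `REQUIRED_MODULES` are hypothetical names, and the module list is only a subset of the ipvs.conf entries above. It parses text in the /proc/modules format, where the first field of each line is the module name.

```python
# Hypothetical helper: report which required kernel modules are not loaded.
REQUIRED_MODULES = ("ip_vs", "ip_vs_rr", "ip_vs_wrr", "ip_vs_sh", "nf_conntrack")


def missing_modules(proc_modules_text, required=REQUIRED_MODULES):
    """Return the names in `required` that do not appear in /proc/modules text."""
    loaded = {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}
    return [name for name in required if name not in loaded]


def read_loaded_modules(path="/proc/modules"):
    """Read the kernel's module list (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a node, `missing_modules(read_loaded_modules())` should return an empty list once the reboot has loaded everything.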
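2. Building a three-master, two-worker k8s cluster
This section installs the components the cluster uses, such as docker-ce, containerd and the Kubernetes components.
There are two runtimes to choose from: Docker and Containerd.
If you install Kubernetes 1.24 or later (dockershim was removed from the kubelet in 1.24; see the official Kubernetes ChangeLog for details), you need Containerd as the Kubernetes runtime. For versions earlier than 1.24, either Docker or Containerd works.
2.1 Install the runtime (my Kubernetes version is 1.27, with containerd as the container runtime)
Installing Docker also installs Containerd, and Docker is needed later to build images and push them to the cloud vendor's image registry, so install Docker on every node anyway.
Install docker-ce 20.10 on all nodes: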
yum install docker-ce-20.10.* docker-ce-cli-20.10.* -y
2.2 First, configure the modules Containerd needs (all nodes):
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
2.3 Load the modules on all nodes:
modprobe -- overlay
modprobe -- br_netfilter
2.4 Configure the kernel parameters Containerd needs on all nodes:
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
2.5 Apply the kernel parameters on all nodes:
sysctl --system
2.6 Generate the Containerd configuration file on all nodes:
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
2.7 Switch Containerd's cgroup driver to systemd on all nodes:
vim /etc/containerd/config.toml
2.8 Find containerd.runtimes.runc.options and add SystemdCgroup = true, as shown in Figure 1.1.
SystemdCgroup = true
2.9 On all nodes, change the sandbox_image pause image to an address that matches your version:
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"
2.10 Start Containerd on all nodes and enable it at boot:
systemctl daemon-reload
systemctl enable --now containerd
2.11 Point the crictl client at the runtime socket on all nodes:
cat > /etc/crictl.yaml <<EOF
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF
2.12 Install the Kubernetes system components
List the Kubernetes versions available from the repo:
yum list kubeadm.x86_64 --showduplicates | sort -r
Install kubeadm, kubelet and kubectl 1.27 on all nodes:
yum install kubeadm-1.27* kubelet-1.27* kubectl-1.27* -y
Enable kubelet at boot on all nodes:
systemctl daemon-reload
systemctl enable --now kubelet
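2.13 Initialize the cluster
Installing a cluster with Kubeadm takes one Master node to initialize the cluster; the other nodes then join it.
The cluster can be initialized directly from the Kubeadm command line or from a configuration file. The command-line form needs many flags, so this example uses a configuration file.
On Master01, create kubeadm-config.yaml as follows (a starting point can also be generated with kubeadm config print init-defaults > kubeadm-config.yaml):
vim kubeadm-config.yaml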
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: 7t2weq.bjbawausm0jaxury  # format: 6 characters, a dot, then 16 characters, all from [a-z0-9]
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.163.151
  bindPort: 6443
nodeRegistration:
  # criSocket: /var/run/dockershim.sock  # use this if Docker is the runtime
  criSocket: unix:///run/containerd/containerd.sock  # use this if Containerd is the runtime
  name: k8s-master01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  kubeletExtraArgs:  # for the 1.27 components these kubelet settings go in the YAML, under nodeRegistration
    # note: network-plugin, cni-bin-dir and cni-conf-dir are dockershim-era kubelet flags;
    # if the 1.27 kubelet rejects any entry below as an unknown flag, remove that entry
    network-plugin: cni
    cni-bin-dir: /opt/cni/bin
    cni-conf-dir: /etc/cni/net.d
    container-runtime: remote
    container-runtime-endpoint: unix:///run/containerd/containerd.sock
    runtime-request-timeout: 15m
    cgroup-driver: systemd
---
apiServer:
  certSANs:
  - 192.168.163.150
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3  # same API version as the InitConfiguration above
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: 192.168.163.150:16443
controllerManager: {}
dns: {}  # v1beta3 has no dns.type field; CoreDNS is the only option
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.27.6  # must match the version reported by kubeadm version
networking:
  dnsDomain: cluster.local
  podSubnet: 172.16.0.0/12
  serviceSubnet: 192.168.0.0/16  # should not overlap the node network (192.168.163.0/24 in this example)
scheduler: {}
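Your version may differ from this example, so migrate the kubeadm configuration file to the current format (run on Master01):
kubeadm config migrate --old-config kubeadm-config.yaml --new-config new.yaml
Copy new.yaml to the other Master nodes:
for i in k8s-master02 k8s-master03; do scp new.yaml $i:/root/; done
Then pull the images on all Master nodes in advance to shorten initialization (the other nodes need no configuration changes, not even the IP address):
kubeadm config images pull --config /root/new.yaml
Initialize Master01. This generates the certificates and configuration files under /etc/kubernetes, after which the other Master nodes join Master01:
kubeadm init --config /root/new.yaml --upload-certs
On success, kubeadm prints join commands containing a token (yours will differ, so do not copy the ones below). Record them, then use them to join the other Master nodes and the Node machines (also called worker nodes) to the cluster. The first command joins a Master, the second a worker:
kubeadm join 192.168.163.150:16443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:60c196cdaacaed263e5d638...... --control-plane --certificate-key f1411a2765268b0759a3b7a8284f9ce31951cf341aad......
kubeadm join 192.168.163.150:16443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:60c196cdaacaed26e5d638216d59b6b4d1da......
After all nodes have joined, check the cluster status. The nodes' STATUS field shows NotReady; the output may vary by version, and NotReady nodes become Ready once the CNI plugin is installed.
Install Calico on Master01: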
cd /root/k8s-ha-install && git checkout manual-installation-v1.27.x && cd calico/
Set the Pod subnet to the one you configured:
POD_SUBNET=$(cat /etc/kubernetes/manifests/kube-controller-manager.yaml | grep cluster-cidr= | awk -F= '{print $NF}')
Substitute it into calico.yaml:
sed -i "s#POD_CIDR#${POD_SUBNET}#g" calico.yaml
kubectl apply -f calico.yaml
Once applied, check the Pods and nodes: the Pods should all be Running and the nodes all Ready.
3. Deploying a Keepalived and HAProxy high-availability cluster
I installed HAProxy and KeepAlived on three machines; choose whatever suits your own environment.
3.1 Install HAProxy and KeepAlived with yum
yum install keepalived haproxy -y
3.2 Configure HAProxy
mkdir -p /etc/haproxy
mv /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
vim /etc/haproxy/haproxy.cfg
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s
defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s
frontend monitor-in
  bind *:33305
  mode http
  option httplog
  monitor-uri /monitor
frontend k8s-master
  bind 0.0.0.0:16443 # listening port
  bind 127.0.0.1:16443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master
backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  server k8s-master01 192.168.163.151:6443 check # backend addresses: the Master IPs from /etc/hosts
  server k8s-master02 192.168.163.152:6443 check
  server k8s-master03 192.168.163.153:6443 check
3.3 Configure KeepAlived
mkdir -p /etc/keepalived
mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id LVS_DEVEL
    script_user root
    enable_script_security
}
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    interval 5
    weight -5
    fall 2
    rise 1
}
vrrp_instance VI_1 {
    state MASTER
    interface ens33 # name of this machine's network interface
    mcast_src_ip 192.168.163.151 # this machine's IP address
    virtual_router_id 51
    priority 101
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    virtual_ipaddress {
        192.168.163.150 # the VIP: an unused address in the same subnet as the hosts
    }
    track_script {
        chk_apiserver
    }
}
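The configuration above points vrrp_script at /etc/keepalived/check_apiserver.sh, which this walkthrough does not show. As an illustration of what such a health check does, here is a minimal sketch in Python rather than the original shell script: it assumes the check only needs to confirm that the local HAProxy frontend on port 16443 accepts TCP connections. The names `tcp_port_open` and `check_exit_code` are hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_exit_code(host="127.0.0.1", port=16443):
    """Exit status for a keepalived check script: 0 means healthy, 1 means failed."""
    return 0 if tcp_port_open(host, port) else 1
```

An executable wrapper would end with `sys.exit(check_exit_code())`; keepalived treats a non-zero exit status as a failed check and, with weight -5, lowers this node's priority.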
4. Setting up MySQL with master-slave replication and read/write splitting
4.1 Install the MySQL repository
Method 1:
Download the official MySQL repository package:
wget https://repo.mysql.com/mysql80-community-release-el7-1.noarch.rpm
Install the official MySQL repository:
rpm -ivh mysql80-community-release-el7-1.noarch.rpm   (or: yum -y install mysql80-community-release-el7-1.noarch.rpm)
Method 2:
Add the official MySQL Yum repository directly from its URL:
yum localinstall https://dev.mysql.com/get/mysql80-community-release-el7-3.noarch.rpm
4.2 Install MySQL server
yum install -y mysql-community-server
4.3 Log in and change the initial password
grep 'A temporary password' /var/log/mysqld.log
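4.4 Configure master-slave replication
Master database
vim /etc/my.cnf
server-id = 154 # unique ID for this server within the replication setup; the last three digits of the IP are a convenient choice
log-bin = mysql-bin # enable the binary log (replication depends on it)
#binlog-do-db = your_database_name # uncomment to replicate only one database
Restart the database:
systemctl restart mysqld
Create a user for replication:
create user 'copy_user'@'%' IDENTIFIED with mysql_native_password by 'xxxxx';
GRANT REPLICATION SLAVE ON *.* TO 'copy_user'@'%';
FLUSH PRIVILEGES;
Then get the current binary log file name and position, which the slave needs in order to start replicating:
SHOW MASTER STATUS;   (or SHOW MASTER STATUS\G)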
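Slave database
vim /etc/my.cnf
server-id = 155 # unique ID for the slave; it must differ from the master's
relay_log = mysql-relay # enable the relay log
read_only = 1 # make the slave read-only
log_bin = mysql-bin # enable the binary log on the slave
#log_slave_updates = 1 # also write replicated changes to the slave's own binary log
Restart the database:
systemctl restart mysqld
Point the slave at the master:
CHANGE MASTER TO
master_host = '192.168.163.154', # IP address of the master (change to your own)
master_user = 'copy_user', # replication account created on the master
master_password = 'xxxxx', # password of that replication account
master_log_file = 'mysql-bin.000001', # binary log file to start from (taken from SHOW MASTER STATUS on the master)
master_log_pos = 817; # position to start from (taken from SHOW MASTER STATUS on the master)
Start the replication threads:
start slave;
Check the slave's replication status:
SHOW SLAVE STATUS\G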
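When reading the SHOW SLAVE STATUS output, the columns that matter most are Slave_IO_Running and Slave_SQL_Running (both should be Yes) and Seconds_Behind_Master. A minimal sketch of that check follows. It assumes the status row has already been fetched as a dict of column name to value, for example with a PyMySQL DictCursor; the function name `replication_problems` and the 30-second lag threshold are illustrative choices, not part of the original setup.

```python
def replication_problems(status, max_lag_seconds=30):
    """Return a list of problems found in one SHOW SLAVE STATUS row.

    `status` maps column names to values; an empty list means replication looks healthy.
    """
    problems = []
    # Both replication threads must report "Yes".
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}, expected 'Yes'")
    # Seconds_Behind_Master is NULL (None) when replication is not running.
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        problems.append("Seconds_Behind_Master is NULL (replication is not running)")
    elif int(lag) > max_lag_seconds:
        problems.append(f"slave is {lag}s behind the master")
    return problems
```

A monitoring job could call this periodically and alert whenever the returned list is not empty.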
5. Deploying the Flask application to the worker nodes
pachong02.py (the crawler module that app.py imports)
#----------------------------------------------------------------------------------------------------------------------#
import os
import csv
import concurrent.futures
from selenium import webdriver
# from selenium.webdriver.chrome.service import Service  # only needed for a local chromedriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
#----------------------------------------------------------------------------------------------------------------------#
def process_job(job, writer):
    try:
        company_name = job.find_element(By.CSS_SELECTOR, ".company-name a").text  # company name
        job_name = job.find_element(By.CSS_SELECTOR, ".job-name").text  # job title
        job_area = job.find_element(By.CSS_SELECTOR, ".job-area").text  # job location
        jingyan = (job.find_elements(By.CSS_SELECTOR, ".tag-list li"))[0].text  # experience
        xueli = (job.find_elements(By.CSS_SELECTOR, ".tag-list li"))[1].text  # education
        guimo = (job.find_elements(By.CSS_SELECTOR, ".company-tag-list li"))[2].text  # company size
        salary = job.find_element(By.CSS_SELECTOR, ".salary").text  # salary
        info_dsec = job.find_element(By.CSS_SELECTOR, ".info-desc").text  # benefits
        writer.writerow([company_name, job_name, job_area, guimo, jingyan, xueli, salary, info_dsec])  # write the posting to the file
    except Exception as e:
        print(f"处理岗位失败: {e}")
#----------------------------------------------------------------------------------------------------------------------#
def pachong(gwname, city, experience, companyPeopleNumber, xueli):
    # Chrome browser options
    chrome_options = Options()
    chrome_options.add_argument("--headless")  # headless mode: no browser window
    chrome_options.add_argument("--disable-gpu")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-extensions")
    chrome_options.add_argument("--incognito")
    # chrome_options.add_argument("start-maximized")  # maximize the browser window
    chrome_options.add_argument("--disable-blink-features=AutomationControlled")  # hide the WebDriver automation hint
    chrome_options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")  # look like a regular browser
    chrome_options.add_argument("--disable-javascript")  # disable JavaScript
    chrome_options.add_argument("--disable-cookies")  # disable cookies
    # The Chrome tab crashed during driver.get(url), which deleted the WebDriver session.
    # --disable-dev-shm-usage keeps Chrome from crashing inside a Docker container when /dev/shm is too small.
    chrome_options.add_argument("--disable-dev-shm-usage")
    # ----------------------------------------------------------------------------------------------------------------------#
    # Selenium Grid URL, pointing at the Service created in Kubernetes; read from the environment.
    # /wd/hub is the default REST API path a client (such as this crawler container) uses to talk to the Selenium Hub over HTTP.
    selenium_url = os.getenv("SELENIUM_URL", "http://selenium-service:4444/wd/hub")
    # Local-driver alternative (requires a matching chromedriver at this path):
    # service = Service("/app/chromedriver")
    # driver = webdriver.Chrome(service=service, options=chrome_options)
    # ----------------------------------------------------------------------------------------------------------------------#
    # Each value passed in from gangwei.html looks like "<site code> <Chinese label>"; an empty value means "no filter".
    def split_option(value, prefix):
        if not value:
            return "", "", ""
        return prefix, value.split(" ")[0], value.split(" ")[1]
    cs, city_pachong, city_chinese = split_option(city, ",城市_")  # city
    jy, experience_pachong, experience_chinese = split_option(experience, ",经验_")  # work experience
    gm, companyPeopleNumber_pachong, companyPeopleNumber_chinese = split_option(companyPeopleNumber, ",规模_")  # company size
    xl, xueli_pachong, xueli_chinese = split_option(xueli, ",学历_")  # education
    # ----------------------------------------------------------------------------------------------------------------------#
    page = 1  # page number (10 pages in total)
    # Target page URL
    url = (f'https://www.zhipin.com/web/geek/job?query={gwname}&city={city_pachong}&experience={experience_pachong}'
           f'&degree={xueli_pachong}&scale={companyPeopleNumber_pachong}&page={page}')
    # The shared folder is mounted into the container at /app/pachong_download_file
    folder_path = "/app/pachong_download_file"
    os.makedirs(folder_path, exist_ok=True)  # create the folder if it does not exist
    # Build the file name and its full path
    file_name = f"岗位_{gwname}{cs}{city_chinese}{gm}{companyPeopleNumber_chinese}{jy}{experience_chinese}{xl}{xueli_chinese}.csv"
    full_file_path = os.path.join(folder_path, file_name)
    # ----------------------------------------------------------------------------------------------------------------------#
    driver = webdriver.Remote(command_executor=selenium_url, options=chrome_options)  # start the Remote WebDriver
    try:
        driver.get(url)  # open the page
        # Wait up to 60 s for the job cards to load
        ele = WebDriverWait(driver, 60).until(
            EC.presence_of_element_located((By.CLASS_NAME, "job-card-left"))
        )
        # tishi = None
        # if ele == None:
        #     tishi = "加载页面超时或出错。未找到满足要求的岗位,请尝试更换查询条件!"
        # Find the container of every job posting
        job_list = driver.find_elements(By.CSS_SELECTOR, ".job-list-box .job-card-wrapper")
        # Open the file for writing
        with open(full_file_path, 'w', newline='', encoding='utf-8') as f:
            writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
            # Header row
            writer.writerow(["公司名称", "岗位名称", "地点信息", "公司规模", "工作经验", "学历要求", "薪水", "福利待遇"])
            # Process the postings in parallel with a ThreadPoolExecutor
            with concurrent.futures.ThreadPoolExecutor(max_workers=32) as executor:
                executor.map(lambda job: process_job(job, writer), job_list)
    finally:
        driver.quit()  # always close the browser, even if loading failed
    # Return the file path and file name (used by the job page in app.py)
    return full_file_path, file_name
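The crawler interpolates the job title straight into the query string, so a title containing spaces, `&` or Chinese characters reaches the site unescaped. A minimal sketch of building the same URL with percent-encoding via the standard library follows. The parameter names (query, city, experience, degree, scale, page) are assumed from the URL the crawler builds; `build_search_url` and `BASE_URL` are hypothetical names.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.zhipin.com/web/geek/job"  # taken from the crawler's URL


def build_search_url(query, city="", experience="", degree="", scale="", page=1):
    """Build the search URL with every parameter percent-encoded."""
    params = {
        "query": query,
        "city": city,
        "experience": experience,
        "degree": degree,
        "scale": scale,
        "page": page,
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

Inside pachong() this would replace the f-string, for example `url = build_search_url(gwname, city_pachong, experience_pachong, xueli_pachong, companyPeopleNumber_pachong, page)`.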
#----------------------------------------------------------------------------------------------------------------------#
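In the listing above, up to 32 worker threads share one csv.writer, and a csv.writer makes no guarantee that concurrent writerow calls will not interleave. One common alternative, sketched here under that assumption, is to let the workers only extract rows and to write the file from a single thread afterwards. `collect_rows` and `write_rows` are hypothetical helpers, and `extract` stands for any function that turns one item into a list of cell values.

```python
import concurrent.futures
import csv


def collect_rows(items, extract, max_workers=8):
    """Run extract(item) in a thread pool; return rows in input order, skipping failures."""
    def safe_extract(item):
        try:
            return extract(item)
        except Exception as exc:  # like process_job: report the failure and continue
            print(f"failed to process item: {exc}")
            return None

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the same order as `items`
        return [row for row in executor.map(safe_extract, items) if row is not None]


def write_rows(fileobj, header, rows):
    """Write the header and all rows from a single thread."""
    writer = csv.writer(fileobj)
    writer.writerow(header)
    writer.writerows(rows)
```

Because executor.map preserves input order, the CSV rows come out in the same order as the job cards on the page. Note that this sketch only addresses the file writes; whether one WebDriver session can safely be driven from many threads is a separate question.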
app.py
import csv
import hashlib
import os
import time
import pymysql
pymysql.install_as_MySQLdb()  # use PyMySQL in place of MySQLdb (python:3.9-slim ships no MySQLdb, so importing it fails)
from flask import Flask, render_template, request, redirect, url_for, session, flash, jsonify, send_file
from flask_sqlalchemy import SQLAlchemy
from flask_socketio import SocketIO
from pachong02 import pachong
# Initialize the Flask application
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'mysql://root:001928_Llt@192.168.163.154/html01'  # adjust to your own database settings
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.secret_key = 'your_secret_key'  # replace with a real secret key that you generated
# Initialize the database and SocketIO
db = SQLAlchemy(app)
socketio = SocketIO(app)
# User model
class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(50), unique=True, nullable=False)
    password = db.Column(db.String(255), nullable=False)
    created_at = db.Column(db.DateTime, default=db.func.current_timestamp())
# Query history model
class History(db.Model):
    __tablename__ = 'history'  # set the table name explicitly
    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'), nullable=False)
    gangwei_name = db.Column(db.String(255), nullable=True)
    city = db.Column(db.String(255), nullable=True)
    scale = db.Column(db.String(255), nullable=True)
    experience = db.Column(db.String(255), nullable=True)
    xueli = db.Column(db.String(255), nullable=True)
    used_time = db.Column(db.String(255), nullable=True)
    filename = db.Column(db.String(255), nullable=True)
    filepath = db.Column(db.String(255), nullable=True)
    created_at = db.Column(db.DateTime, default=db.func.current_timestamp())
    user = db.relationship('User', backref=db.backref('history', lazy=True))
    def __repr__(self):
        return f"<History {self.id}>"
# Create the database tables inside the application context
with app.app_context():
    db.create_all()
# ___________________________________________________________________________________________________________________________
# Home page route
@app.route('/')
def index():
    # pod_ip = socket.gethostbyname(socket.gethostname())
    # print(f"This request was handled by Pod with IP: {pod_ip}")
    return render_template('shouye.html')
# User registration
@app.route('/register', methods=['GET', 'POST'])
def register():
    if request.method == 'POST':
        username = request.form['username']
        password = hashlib.sha256(request.form['password'].encode()).hexdigest()
        # Check whether the username is already taken
        existing_user = User.query.filter_by(username=username).first()
        if existing_user:
            # Username taken: flash an error and show the form again
            flash('用户名已存在,请选择其他用户名', 'error')
            return redirect(url_for('register'))
        # Username is free: create the new user
        user = User(username=username, password=password)
        db.session.add(user)
        db.session.commit()
        # Log the new user in by storing the id and name in the session
        session['user_id'] = user.id
        session['username'] = user.username
        return redirect(url_for('gangwei'))
    return render_template('register.html')
# User login
@app.route('/login', methods=['GET', 'POST'])
def login():
    if request.method == 'POST':
        username = request.form['username']
        password = hashlib.sha256(request.form['password'].encode()).hexdigest()
        user = User.query.filter_by(username=username, password=password).first()
        if user:
            session['user_id'] = user.id
            session['username'] = user.username  # keep the username in the session
            return redirect(url_for('gangwei'))
        else:
            return '用户名或密码错误!', 403
    return render_template('login.html')
# User logout
@app.route('/logout')
def logout():
    session.pop('user_id', None)
    session.pop('username', None)  # clear the username on logout
    return redirect(url_for('index'))
# Job search page
@app.route('/gangwei', methods=['GET', 'POST'])
def gangwei():
    if 'user_id' not in session:
        return redirect(url_for('login'))
    user = db.session.get(User, session['user_id'])
    no_jobs_found = False  # assume by default that jobs were found
    if request.method == 'POST':
        start_time = time.time()  # start timing
        # The job title is required
        if not request.form['gangwei_name']:
            return "必填岗位字段,未填写完整,请检查并重试。", 400
        gangwei_name = request.form['gangwei_name']
        # Each optional filter arrives as "<code> <label>"; only the label is stored in MySQL
        def label_of(field):
            value = request.form[field]
            return value.split(" ")[1] if value else ""
        city_mysql = label_of('city')
        experience_mysql = label_of('experience')
        guimo_mysql = label_of('guimo')
        xueli_mysql = label_of('xueli')
        # Run the crawler and record the file path
        filepath, filename = pachong(gangwei_name, request.form['city'], request.form['experience'],
                                     request.form['guimo'], request.form['xueli'])
        # Check whether the crawler returned a data file
        if not filepath:
            no_jobs_found = True
        end_time = time.time()  # stop timing
        used_time = format(end_time - start_time, ".2f") + 's'
        used_time02 = end_time - start_time
        # Save the query to the database
        history = History(user_id=session['user_id'], gangwei_name=gangwei_name, city=city_mysql, scale=guimo_mysql,
                          filepath=filepath, experience=experience_mysql, filename=filename, xueli=xueli_mysql, used_time=used_time)
        db.session.add(history)
        db.session.commit()
        # Alternative: fetch the newest history record and render it with the template
        # history = History.query.filter_by(user_id=session['user_id']).order_by(History.created_at.desc()).first()
        # return render_template('gangwei.html', username=user.username, history=history, no_jobs_found=no_jobs_found)
        return jsonify(success=True, time_used=used_time02, message="查询成功")
    # Render the template on GET
    return render_template('gangwei.html', username=user.username)
# History page
@app.route('/jilu')
def jilu():
    if 'user_id' not in session:
        return redirect(url_for('login'))
    # Fetch the user's past queries
    history = History.query.filter(History.user_id == session['user_id']).all()
    return render_template('jilu.html', history=history)
# Record details: preview the CSV file
@app.route('/view_file/<int:history_id>')
def view_file(history_id):
    if 'user_id' not in session:
        return redirect(url_for('login'))
    # Look up the history record; users may only open their own records
    record = History.query.get_or_404(history_id)
    if record.user_id != session['user_id']:
        return "无权访问", 403
    full_file_path = record.filepath
    # Alternative: rebuild the path from the shared directory mounted in the container
    # full_file_path = os.path.join('/app/pachong_download_file', record.filename)
    # Make sure the file exists
    if not os.path.exists(full_file_path):
        return "文件未找到", 404
    # Read the CSV file
    rows = []
    with open(full_file_path, newline='', encoding='utf-8') as file:
        csv_reader = csv.reader(file)
        for row in csv_reader:
            rows.append(row)
    return render_template('view_file.html', record=record, rows=rows)
# Download the CSV file
@app.route('/download_file/<int:history_id>')
def download_file(history_id):
    if 'user_id' not in session:
        return redirect(url_for('login'))
    # Look up the history record; users may only download their own records
    record = History.query.get_or_404(history_id)
    if record.user_id != session['user_id']:
        return "无权访问", 403
    full_file_path = record.filepath
    # Make sure the file exists
    if not os.path.exists(full_file_path):
        return "文件未找到", 404
    return send_file(full_file_path, as_attachment=True)
# Start the Flask application
if __name__ == "__main__":
    socketio.run(app, host="0.0.0.0", port=5002)
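The register and login routes in app.py store an unsalted SHA-256 digest of the password, so identical passwords produce identical hashes and are cheap to brute-force. A salted, deliberately slow hash is the usual remedy. The following is a minimal sketch using only the standard library; `hash_password`, `verify_password` and the iteration count are illustrative assumptions, not the author's implementation.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative work factor


def hash_password(password, salt=None):
    """Return "<salt hex>$<digest hex>" using PBKDF2-HMAC-SHA256 with a random salt."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def verify_password(password, stored):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$", 1)
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(candidate.hex(), digest_hex)
```

With this scheme, registration would store `hash_password(...)` in User.password (the result is 97 characters, which fits the String(255) column), and login would look the user up by username only and then call `verify_password`.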
HTML templates, in order: shouye.html, register.html, login.html, jilu.html, gangwei.html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>岗位查询系统</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/css/bootstrap.min.css">
</head>
<body>
<div class="container mt-5">
<h4>岗位查询系统</h4>
<div class="card">
<div class="card-body">
<a href="{{ url_for('register') }}" >注册</a><br><br>
<a href="{{ url_for('login') }}" >登录</a><br>
<!-- <br>This request was handled with IP: {{pod_ip}}-->
</div>
</div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/js/bootstrap.bundle.min.js"></script>
</body>
</html>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>注册</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/css/bootstrap.min.css">
<style>
.error {
color: red;
font-size: 0.9em;
margin-left: 10px; /* 控制错误消息与输入框之间的距离 */
vertical-align: middle; /* 确保错误信息垂直居中对齐 */
display: inline; /* 确保错误信息与输入框在同一行 */
}
</style>
</head>
<body>
<div class="container mt-5">
<h4>注册用户</h4>
<div class="card">
<div class="card-body">
<form method="POST">
<input type="text" name="username" placeholder="用户名" required>
<!-- Show the flashed "username already exists" error message -->
{% with messages = get_flashed_messages(with_categories=true) %}
{% if messages %}
{% for category, message in messages %}
{% if category == 'error' %}
<p class="error">{{ message }}</p>
{% endif %}
{% endfor %}
{% endif %}
{% endwith %}
<br><input type="password" name="password" placeholder="密码" required><br>
<button type="submit" style="border: 1px solid black;">注册并登录</button>
</form>
<br><a href="{{ url_for('logout') }}">回到主界面</a> <!-- 退出按钮 -->
<!-- <br>This request was handled with IP: {{pod_ip}}-->
</div>
</div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/js/bootstrap.bundle.min.js"></script>
</body>
</html>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>登录</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/css/bootstrap.min.css">
</head>
<body>
<div class="container mt-5">
<h4>登录</h4>
<div class="card">
<div class="card-body">
<form method="POST">
<input type="text" name="username" placeholder="用户名" required><br>
<input type="password" name="password" placeholder="密码" required><br>
<button type="submit" style="border: 1px solid black;">登录</button>
</form>
<br><a href="{{ url_for('logout') }}">回到主界面</a> <!-- 退出按钮 -->
<!-- <br>This request was handled with IP: {{pod_ip}}-->
</div>
</div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/js/bootstrap.bundle.min.js"></script>
</body>
</html>
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>历史记录,可以查看预览,以及可以下载文件保存</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/css/bootstrap.min.css">
</head>
<body>
<div class="container mt-5">
<h4 style="display: inline;">历史记录</h4>
<a style="float: right;" href="{{ url_for('gangwei') }}">返回查询界面</a>
<div class="card">
<div class="card-body">
<a style="float: right;" href="{{ url_for('logout') }}">退出登录</a><br>
<!-- Show the user's past queries (each row: job title, city, company size, query time, view file, download file) -->
{% if history %}
<div class="table-responsive"> <!-- 添加响应式类,使表格在小屏幕下可滚动 -->
<table class="table table-bordered table-striped">
<thead>
<tr>
<th>岗位名称</th>
<th>城市</th>
<th>公司规模</th>
<th>工作经验</th>
<th>学历</th>
<th>查询时间</th>
<th>用时</th>
<th>查看文件</th>
<th>下载文件</th>
</tr>
</thead>
<tbody>
{% for record in history %}
<tr>
<td>{{ record.gangwei_name }}</td>
<td>{{ record.city }}</td>
<td>{{ record.scale }}</td>
<td>{{ record.experience }}</td>
<td>{{ record.xueli }}</td>
<td>{{ record.created_at }}</td>
<td>{{ record.used_time }}</td>
<td><a href="{{ url_for('view_file', history_id=record.id) }}">查看</a></td>
<td><a href="{{ url_for('download_file', history_id=record.id) }}">下载</a></td>
</tr>
{% endfor %}
</tbody>
<!-- <br>This request was handled with IP: {{pod_ip}}-->
</table>
</div>
{% else %}
<p>暂无历史记录。</p>
{% endif %}
<br>
</div>
</div>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/js/bootstrap.bundle.min.js"></script>
</body>
</html>
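The history template above reads eight attributes from each `record` and falls back to the "暂无历史记录" message when `history` is empty. The view that builds `history` is not shown in this part, so the following is only a minimal sketch of one way to shape database rows into objects the template can render. The column order, the `HistoryRecord` class and the `rows_to_history` helper are assumptions made for illustration, not the author's code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class HistoryRecord:
    """One row of the history table; field names mirror the template."""
    id: int
    gangwei_name: str
    city: str
    scale: str
    experience: str
    xueli: str
    created_at: str
    used_time: str


def rows_to_history(rows):
    """Convert DB rows into records the template can iterate over.

    Assumed (hypothetical) column order: id, gangwei_name, city, scale,
    experience, xueli, created_at (datetime), used_time (seconds, float).
    """
    history = []
    for rec_id, name, city, scale, exp, xueli, created, used in rows:
        history.append(HistoryRecord(
            id=rec_id,
            gangwei_name=name,
            city=city,
            scale=scale,
            experience=exp,
            xueli=xueli,
            # Format once here so the template can print the value directly.
            created_at=created.strftime("%Y-%m-%d %H:%M:%S"),
            used_time=f"{used:.2f}s",
        ))
    return history
```

An empty list is falsy in Jinja, so passing `rows_to_history([])` to the template takes the `{% else %}` branch.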
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>搜索岗位</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/css/bootstrap.min.css">
<style>
/* Overlay */
#overlay {
display: none;
position: fixed;
top: 0;
left: 0;
width: 100%;
height: 100%;
background-color: rgba(0, 0, 0, 0.5);
z-index: 1000;
}
/* Popup */
#popup {
display: none;
position: fixed;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
width: 300px;
padding: 20px;
background: white;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
border-radius: 8px;
text-align: center;
z-index: 1001;
}
/* Close-button styles, currently disabled.
   HTML comment markers do not reliably disable rules inside a style element,
   so the block is wrapped in a single CSS comment instead.
#popup button {
  margin-top: 10px;
  padding: 5px 10px;
  background-color: #6C757D;  (gray)
  color: white;
  border: none;
  cursor: pointer;
}
Hover effect for the button:
#popup button:hover {
  background-color: #343A40;  (dark gray)
}
*/
</style>
</head>
<body>
<div class="container mt-5">
<h4 style="display: inline;">欢迎来到岗位查询系统</h4>
<a style="float: right;" href="{{ url_for('logout') }}">退出登录</a>
<div class="card">
<div class="card-body">
<a style="margin-right: 130px;">你好:{{ username }}</a><br><br>
<a>说明:根据个人需求,填写岗位信息。提交查询后等待数秒,待显示查询完毕后,即可到历史记录查看结果。</a><br><br>
<form method="POST" id="query-form">
<a>岗位:</a>
<input size=10 type="text" name="gangwei_name" placeholder="请输入岗位" required>
<a> 城市:</a>
<select name="city">
<option value="">请选择城市</option>
<option value="100010000 全国">全国</option>
<option value="101010100 北京">北京</option>
<option value="101020100 上海">上海</option>
<option value="101280100 广州">广州</option>
<option value="101280600 深圳">深圳</option>
<option value="101210100 杭州">杭州</option>
<option value="101030100 天津">天津</option>
<option value="101110100 西安">西安</option>
<option value="101190400 苏州">苏州</option>
<option value="101200100 武汉">武汉</option>
<option value="101230200 厦门">厦门</option>
<option value="101250100 长沙">长沙</option>
<option value="101270100 成都">成都</option>
<option value="101180100 郑州">郑州</option>
<option value="101040100 重庆">重庆</option>
</select>
<a> 规模:</a>
<select name="guimo">
<option value="">请选择规模</option>
<option value="301 0-20人">0-20人</option>
<option value="302 20-99人">20-99人</option>
<option value="303 100-499人">100-499人</option>
<option value="304 500-999人">500-999人</option>
<option value="305 1000-9999人">1000-9999人</option>
<option value="306 10000人以上">10000人以上</option>
</select>
<a> 经验:</a>
<select name="experience">
<option value="">请选择经验</option>
<option value="0 不限">不限</option>
<option value="108 在校生">在校生</option>
<option value="102 应届生">应届生</option>
<option value="101 经验不限">经验不限</option>
<option value="103 1年以内">1年以内</option>
<option value="104 1-3年">1-3年</option>
<option value="105 3-5年">3-5年</option>
<option value="106 5-10年">5-10年</option>
<option value="107 10年以上">10年以上</option>
</select>
<a> 学历:</a>
<select name="xueli">
<option value="">请选择学历</option>
<option value="0 不限">不限</option>
<option value="209 初中及以下">初中及以下</option>
<option value="208 中专中技">中专中技</option>
<option value="206 高中">高中</option>
<option value="202 大专">大专</option>
<option value="203 本科">本科</option>
<option value="204 硕士">硕士</option>
<option value="205 博士">博士</option>
</select>
<a> </a>
<button type="submit" style="border: 1px solid black; padding: 0px 5px; border-radius: 5px; ">提交查询</button>
<br><br>
</form>
<a>记录:</a><a href="{{ url_for('jilu') }}">历史记录</a>
<!-- A small popup shows "查询中..." while the query is running -->
<!-- and then reports that the crawl finished and how many seconds it took -->
</div>
</div>
</div>
<!-- Overlay and popup -->
<div id="overlay"></div>
<div id="popup">
<p id="popup-message">查询中...</p>
<button id="popup-close">关闭</button>
</div>
<script src="https://cdnjs.cloudflare.com/ajax/libs/bootstrap/5.1.0/js/bootstrap.bundle.min.js"></script>
<script>
const form = document.getElementById('query-form');
const overlay = document.getElementById('overlay');
const popup = document.getElementById('popup');
const popupMessage = document.getElementById('popup-message');
const popupCloseButton = document.getElementById('popup-close');
// Show the popup with the given message
function showPopup(message) {
  overlay.style.display = 'block';
  popup.style.display = 'block';
  popupMessage.textContent = message;
}
// Hide the popup
function hidePopup() {
  overlay.style.display = 'none';
  popup.style.display = 'none';
}
// Close the popup when the close button is clicked
popupCloseButton.addEventListener('click', hidePopup);
// Handle form submission
form.addEventListener('submit', function (event) {
  event.preventDefault(); // Prevent the default full-page submit
  // Show the "querying" popup
  showPopup('查询中...');
  // Collect the form data
  const formData = new FormData(form);
  // Send the AJAX request
  fetch('/gangwei', {
    method: 'POST',
    body: formData
  })
    .then(response => response.json())
    .then(data => {
      // Check whether the query succeeded
      if (data.success) {
        showPopup(`查询完毕!用时 ${data.time_used.toFixed(2)} 秒`);
      } else {
        showPopup('查询失败,请检查条件!');
      }
      // Close the popup after 2 seconds
      setTimeout(hidePopup, 2000);
    })
    .catch(error => {
      console.error('Error:', error);
      showPopup('查询失败,请稍后重试!');
      // Close the popup after 2 seconds
      setTimeout(hidePopup, 2000);
    });
});
</script>
</body>
</html>
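Two details of the search page constrain the backend. Every `<option>` value combines a site code and a label separated by a space (for example "101010100 北京"), and the page script expects a JSON body with a boolean `success` and a numeric `time_used`, because it calls `toFixed(2)` on that field. The handler for `/gangwei` is not part of this section, so the sketch below only illustrates those two contracts. `split_option` and `build_result` are hypothetical helper names.

```python
import json


def split_option(value):
    """Split an option value such as "101010100 北京" into (code, label).

    The empty placeholder option ("请选择...") yields ("", "").
    """
    if not value:
        return "", ""
    code, _, label = value.partition(" ")
    return code, label


def build_result(ok, elapsed_seconds=0.0):
    """Build the JSON body that the page script reads.

    time_used is only sent on success and must stay a number,
    since the script formats it with toFixed(2).
    """
    payload = {"success": bool(ok)}
    if ok:
        payload["time_used"] = float(elapsed_seconds)
    return json.dumps(payload)
```

In a Flask view the same payload would typically be returned with `jsonify`. The string form is used here only so that the sketch has no dependency outside the standard library.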
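6. Set up NFS shared storage and create the PV and PVC
6.1 Install and configure NFS
Install nfs-utils on the NFS server. The package provides both the server and the client tools, so the worker nodes that will mount the share need it as well:
yum install -y nfs-utils
Create a shared directory and set its permissions: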
mkdir -p /mnt/data/share
chmod 777 /mnt/data/share
Edit /etc/exports so that the Kubernetes nodes are allowed to mount the directory:
echo "/mnt/data/share *(rw,sync,no_root_squash)" | sudo tee -a /etc/exports
Start the NFS service and enable it at boot:
systemctl start nfs-server
systemctl enable nfs-server
Export the file systems listed in /etc/exports:
exportfs -a
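Make sure the firewall allows NFS traffic. This is only needed when firewalld is running; if it was disabled as in step 1.4, skip these two commands: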
firewall-cmd --permanent --zone=public --add-service=nfs
firewall-cmd --reload
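Before moving on to the PV, it can help to confirm from a Kubernetes node that the NFS server is reachable. The following is a minimal sketch, assuming the server listens on the standard NFS port 2049. The address in the usage comment is the k8s-node01 entry from the hosts file in section 1.1 and is only an assumption about where the NFS server runs.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example (run on a worker node; the address is an assumption):
# print(port_open("192.168.163.154", 2049))
```

A result of False usually points to a stopped nfs-server service, a firewall rule, or a host name that the node cannot resolve.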
6.2 Configure the PV and PVC in Kubernetes
Create a PV manifest, nfs-pv.yaml:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv
spec:
  capacity:
    storage: 10Gi   # adjust the size as needed
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /mnt/data/share   # shared path on the NFS server VM
    server: node01          # IP address of the NFS server VM, or a host name every node can resolve
Create a PVC manifest, nfs-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
Apply the PV and PVC manifests:
kubectl apply -f nfs-pv.yaml
kubectl apply -f nfs-pvc.yaml
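The PVC above binds to nfs-pv only if the two specs are compatible: the PV must offer at least the requested capacity, it must support every access mode the claim asks for, and the storage class names must match. The sketch below checks those three conditions on plain dictionaries that mirror the `spec` sections of the manifests. It is a simplified illustration, not the Kubernetes binding algorithm; `parse_quantity` and `can_bind` are hypothetical helpers, and only binary suffixes such as Gi are handled.

```python
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity):
    """Convert a quantity such as "10Gi" to bytes (binary suffixes only)."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def can_bind(pv_spec, pvc_spec):
    """Rough check of the static binding conditions between a PV and a PVC."""
    capacity = parse_quantity(pv_spec["capacity"]["storage"])
    request = parse_quantity(pvc_spec["resources"]["requests"]["storage"])
    enough_space = capacity >= request
    # Every access mode requested by the claim must be offered by the volume.
    modes_ok = set(pvc_spec["accessModes"]) <= set(pv_spec["accessModes"])
    # A missing storageClassName is treated as "" on both sides.
    class_ok = pv_spec.get("storageClassName", "") == pvc_spec.get("storageClassName", "")
    return enough_space and modes_ok and class_ok
```

One caveat: if the cluster defines a default StorageClass, a claim without `storageClassName` may be assigned that class and then no longer match a classless PV. `kubectl get pvc nfs-pvc` shows whether the claim actually reached the Bound state.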
6.3 Mount the NFS storage in a container
Create a Pod manifest, pachong-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: pachong-pod
spec:
  containers:
    - name: pachong-container
      image: your-image
      volumeMounts:
        - mountPath: /app/pachong_download_file   # mount path inside the container
          name: nfs-volume
  volumes:
    - name: nfs-volume
      persistentVolumeClaim:
        claimName: nfs-pvc   # name of the PVC
Apply the manifest:
kubectl apply -f pachong-pod.yaml
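Once the Pod is running, a quick way to confirm that the share is really mounted and writable is to write a marker file under the mount path and read it back. The following is a minimal sketch that could be run inside the container. `check_shared_dir` is a hypothetical helper, and the path in the usage comment is the `mountPath` from the manifest above.

```python
import os
import uuid


def check_shared_dir(path):
    """Write a small marker file under path, read it back, then delete it."""
    marker = os.path.join(path, f".nfs-check-{uuid.uuid4().hex}")
    try:
        with open(marker, "w", encoding="utf-8") as fh:
            fh.write("ok")
        with open(marker, encoding="utf-8") as fh:
            return fh.read() == "ok"
    except OSError:
        # Missing directory, read-only mount, permission problems, etc.
        return False
    finally:
        if os.path.exists(marker):
            os.remove(marker)


# Example (inside the container):
# print(check_shared_dir("/app/pachong_download_file"))
```

Because the volume is ReadWriteMany, a file written by one Pod should also be visible under /mnt/data/share on the NFS server and in other Pods that mount the same claim.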
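7. Install a tunneling tool and configure port forwarding
The tool used here is ngrok; the public address comes from the Domains page of the ngrok dashboard.
Install: open CMD and run the following in the ngrok directory (replace the placeholder with the authtoken of your own account):
choco install ngrok
ngrok config add-authtoken <your-authtoken>
Click Domains in the left sidebar of the dashboard and copy the address shown there (see the screenshot), then run:
ngrok http --url=xxxxxx-xxxxxx-xxxxx.ngrok-free.app 80
This exposes port 80 of the host at that address. With host port 80 forwarded to the port on the cluster VIP, the project becomes reachable from the public internet.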
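To confirm that the tunnel works end to end, the public address can be requested from any machine. The following is a minimal sketch using only the standard library. `http_status` is a hypothetical helper, and the `ngrok-skip-browser-warning` header is included on the assumption that a free ngrok domain would otherwise answer with its browser warning page.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if it cannot be reached."""
    request = urllib.request.Request(
        url, headers={"ngrok-skip-browser-warning": "1"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        return None


# Example (use the domain copied from the dashboard):
# print(http_status("https://xxxxxx-xxxxxx-xxxxx.ngrok-free.app/"))
```

A status of 200 means the request travelled through ngrok and host port 80 to the Flask application. None usually means the tunnel is not running.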
8. Final result
Author: 塑梦_