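In today's AI field, large models play an important role in expanding AI application scenarios, and after sustained exploration they have entered the deployment stage. Deployment, however, faces two key challenges: the demand for massive computing resources, and concerns about data privacy and security. Private deployment of large models at the edge has become an effective way to address both.

Deploying large models at the edge reduces latency and bandwidth consumption, so that models can run inference and serve applications quickly on edge nodes. It also strengthens data privacy protection, which is essential to enterprise data security.

In response to market demand, Yingma Technology (英碼科技) has launched a product line for private large-model deployment based on the Sophgo BM1684X platform: the edge computing box IVP03X-V2 and the cloud-edge accelerator cards AIV02X and AIV03X, which help enterprises bring vertical large-model applications into production.

▎Edge computing box IVP03X-V2

The IVP03X-V2 is a high-performance edge computing box built by Yingma Technology on the BM1684X. It delivers up to 32 TOPS of INT8 compute, ships with 16 GB of memory, and supports large models such as Llama2-7B, ChatGLM3-6B, and Qwen-7B as well as SAM and Stable Diffusion. It is one of the few edge computing devices in the industry that is compatible with both Chinese and international deep learning frameworks and can run large language model inference smoothly.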
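As a rough back-of-the-envelope check on why quantized 6B and 7B models are a plausible fit for a box with 16 GB of memory, the sketch below estimates weight storage only. The parameter counts are nominal values taken from the model names, and the estimate ignores the KV cache, activations, and runtime overhead, so it is an illustration under stated assumptions rather than a measured figure.

```python
def weight_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GiB for a given model size and precision."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Nominal parameter counts (assumption: taken from the model names only).
MODELS = [("ChatGLM3-6B", 6.0), ("Qwen-7B", 7.0), ("Llama2-7B", 7.0)]
PRECISIONS = [("INT4", 4), ("INT8", 8), ("FP16", 16)]

for name, size in MODELS:
    for label, bits in PRECISIONS:
        print(f"{name:12s} {label}: ~{weight_gib(size, bits):.1f} GiB of weights")
```

Under these assumptions the INT4 and INT8 variants leave considerable headroom, while FP16 weights for a 6B model already take a large share of the 16 GB.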

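▎Large-model inference accelerator cards AIV02X & AIV03X

The AIV02X and AIV03X deliver up to 64 TOPS@INT8 and 72 TOPS@INT8, with 32 GB and 48 GB of on-board memory respectively. Both support multi-chip distributed inference and can run large language models, prompt-based models, and image generation models. These two cloud-edge accelerator cards are suitable for inference with general-purpose large models at the edge, such as language and text-to-image models, as well as private models for vertical industries.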

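Next, taking Yingma Technology's IVP03X edge computing box as an example, we walk through the deployment process and results for large language models and a text-to-image model:

Yingma Technology IVP03X-V2: testing large language models

I. Preparation

Demo download address: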

https://github.com/sophgo/sophon-demo

II. Memory configuration for large models

1. Create a folder for the tools:

mkdir memedit && cd memedit

2. Download the memory configuration tool:

wget -nd https://sophon-file.sophon.cn/sophon-prod-s3/drive/23/09/11/13/DeviceMemoryModificationKit.tgz
tar xvf DeviceMemoryModificationKit.tgz
cd DeviceMemoryModificationKit
tar xvf memory_edit_{vx.x}.tar.xz # vx.x is the version number
cd memory_edit

3. Reconfigure the memory:

./memory_edit.sh -p # prints the current memory layout
./memory_edit.sh -c -npu 7615 -vpu 3072 -vpp 3072 # the NPU can also access the VPU and VPP memory

Replace emmcboot.itb:

sudo cp /data/memedit/DeviceMemoryModificationKit/memory_edit/emmcboot.itb /boot/emmcboot.itb && sync
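To make the split above easier to reason about, here is a small arithmetic sketch of the resulting memory budget. It assumes the -npu, -vpu, and -vpp values are in MiB and that the full 16 GB is addressable. Neither assumption is stated in this article, so check the actual layout with ./memory_edit.sh -p.

```python
# Assumption: heap sizes are given in MiB and the board exposes a full 16 GiB.
TOTAL_MIB = 16 * 1024

# Values taken from the memory_edit.sh command in this section.
heaps = {"npu": 7615, "vpu": 3072, "vpp": 3072}

reserved = sum(heaps.values())
remaining = TOTAL_MIB - reserved

print(f"reserved for device heaps: {reserved} MiB")   # 13759 MiB
print(f"left for the Linux system: {remaining} MiB")  # 2625 MiB
```

Because the NPU can also access the VPU and VPP heaps, the memory a model can actually use may be larger than the NPU heap alone.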

4. Reboot for the change to take effect:

reboot

After rebooting, check the configuration:

free -h
cat /sys/kernel/debug/ion/bm_npu_heap_dump/summary | head -2
cat /sys/kernel/debug/ion/bm_vpu_heap_dump/summary | head -2
cat /sys/kernel/debug/ion/bm_vpp_heap_dump/summary | head -2
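The three cat | head -2 checks can also be scripted, which is convenient when verifying several boxes. The sketch below reads the first two lines of each heap summary. The helper names head and report are hypothetical, and reading debugfs normally requires root, so heaps that are missing or unreadable are skipped.

```python
from itertools import islice
from pathlib import Path
from typing import Dict, List

ION_DEBUG_DIR = Path("/sys/kernel/debug/ion")
HEAPS = ("bm_npu_heap_dump", "bm_vpu_heap_dump", "bm_vpp_heap_dump")


def head(path: Path, n: int = 2) -> List[str]:
    """Return the first n lines of a text file, like `head -n`."""
    with path.open() as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def report(base: Path = ION_DEBUG_DIR) -> Dict[str, List[str]]:
    """Collect the first two summary lines of every readable heap under base."""
    result = {}
    for heap in HEAPS:
        try:
            result[heap] = head(base / heap / "summary")
        except OSError:
            # Heap not present, or debugfs not readable without root.
            continue
    return result


for heap, lines in report().items():
    print(heap)
    for line in lines:
        print("  " + line)
```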

III. Testing the ChatGLM3 model (English mode)

1. Demo download (ChatGLM3)

Change into the ChatGLM3 sample directory: sophon-demo-release/sample/ChatGLM3/

Install pip3 and dfss:

sudo apt install python3-pip
pip3 install dfss -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip3 install dfss --upgrade

Download the model:

sudo apt install unzip
chmod -R +x scripts/
./scripts/download.sh

2. Install dependencies

Install the Python dependencies:

pip3 install -r python/requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/

Download and install the sail package:

python3 -m dfss --url=open@sophgo.com:sophon-demo/ChatGLM3/sail/soc/sophon_arm-3.7.0-py3-none-any.whl
pip3 install sophon_arm-3.7.0-py3-none-any.whl

3. Run the model:

python3 python/chatglm3.py --bmodel models/BM1684X/chatglm3-6b_int4.bmodel --token python/token_config --dev_id 0
python3 python/chatglm3.py --bmodel models/BM1684X/chatglm3-6b_int8.bmodel --token python/token_config --dev_id 0
python3 python/chatglm3.py --bmodel models/BM1684X/chatglm3-6b_fp16.bmodel --token python/token_config --dev_id 0
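The three runs differ only in the quantization suffix of the bmodel file, so the command lines can be generated rather than typed by hand. This is a minimal sketch: the helper chatglm3_cmd is hypothetical, and it only reuses the script path and flags shown in the commands in this section.

```python
import shlex


def chatglm3_cmd(precision: str, dev_id: int = 0) -> list:
    """Build the argv list for one chatglm3.py run at the given precision."""
    return [
        "python3", "python/chatglm3.py",
        "--bmodel", f"models/BM1684X/chatglm3-6b_{precision}.bmodel",
        "--token", "python/token_config",
        "--dev_id", str(dev_id),
    ]


for precision in ("int4", "int8", "fp16"):
    # Print the command; on the device it could instead be passed to
    # subprocess.run(cmd, check=True) from the sample directory.
    print(shlex.join(chatglm3_cmd(precision)))
```

Keeping the arguments as a list avoids the quoting and spacing mistakes that creep in when long commands are copied between documents.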

IV. Testing the Qwen model (Chinese mode)

1. Demo download (Qwen)

Change into the Qwen sample directory: sophon-demo-release/sample/Qwen/

Install pip3 and dfss:

sudo apt install python3-pip
pip3 install dfss -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip3 install dfss --upgrade

Download the model:

sudo apt install unzip
chmod -R +x scripts/
./scripts/download.sh

2. Install dependencies

Install the Python dependencies.

3. Run the model:

python3 python/qwen.py --bmodel models/BM1684X/qwen-7b_int4_1dev.bmodel --token python/token_config --dev_id 0
python3 python/qwen.py --bmodel models/BM1684X/qwen-7b_int8_1dev.bmodel --token python/token_config --dev_id 0

Yingma Technology IVP03X-V2: testing a text-to-image model

1. Demo download (StableDiffusionV1_5)

Change into the StableDiffusionV1_5 sample directory under sophon-demo-release/sample/.

Install pip3 and dfss:

sudo apt install python3-pip
pip3 install dfss -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip3 install dfss --upgrade

Download the models:

sudo apt install unzip
chmod -R +x scripts/
./scripts/download_controlnets_bmodel.sh
./scripts/download_multilize_bmodel.sh
./scripts/download_singlize_bmodel.sh

2. Install dependencies

Install the Python dependencies:

pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple/

Install the sail package:

python3 -m dfss --url=open@sophgo.com:sophon-demo/ChatGLM3/sail/soc/sophon_arm-3.7.0-py3-none-any.whl

3. Run the model:

① Text-to-image generation

20 iterations

python3 run.py --model_path ../models/BM1684X --stage singlize --prompt "A parrot resting on a branch" --neg_prompt "worst quality" --num_inference_steps 20 --dev_id 0

500 iterations

python3 run.py --model_path ../models/BM1684X --stage singlize --prompt "A parrot resting on a branch" --neg_prompt "worst quality" --num_inference_steps 500 --dev_id 0
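The two text-to-image runs differ only in the number of inference steps, so the command can be parameterized. The sketch below builds the argument list and lets shlex handle quoting of the multi-word prompt. The helper sd_cmd is hypothetical and uses only the flags that appear in the commands in this section.

```python
import shlex


def sd_cmd(prompt: str, steps: int,
           neg_prompt: str = "worst quality", dev_id: int = 0) -> list:
    """Build the argv list for one single-stage text-to-image run of run.py."""
    return [
        "python3", "run.py",
        "--model_path", "../models/BM1684X",
        "--stage", "singlize",
        "--prompt", prompt,
        "--neg_prompt", neg_prompt,
        "--num_inference_steps", str(steps),
        "--dev_id", str(dev_id),
    ]


for steps in (20, 500):
    # shlex.join quotes arguments containing spaces, so the printed
    # line can be pasted into a shell unchanged.
    print(shlex.join(sd_cmd("A parrot resting on a branch", steps)))
```

More steps generally mean a longer run, so sweeping a few values this way is a simple way to compare output quality against generation time on the device.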

② Image generation guided by the ControlNet plugin

A rabbit drinking at a bar at night: 20 iterations

python3 run.py --model_path ../models/BM1684X --stage multilize --controlnet_name scribble_controlnet_fp16.bmodel --processor_name scribble_processor_fp16.bmodel --controlnet_img ../pics/generated_img.jpg --prompt "a rabbit drinking at the bar at night" --neg_prompt "worst quality" --num_inference_steps 100 --dev_id 0

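A rabbit drinking at a bar at night: 200 iterations

Conclusion

As large-model technology continues to be deployed and applied, the powerful emergent capabilities of large models are no longer confined to the cloud, and model algorithms are gradually extending to the edge. Going forward, Yingma Technology will draw on its hardware and software strengths and its extensive experience to help more enterprises deploy edge-side large-model applications efficiently and with a low barrier to entry, advancing the intelligent transformation of industries.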

Reviewing editor: Huang Yu