Nov 29, 2023 • 약 5분 (1,076 단어) • ko

Journey to Creating DH Lee Chatbot - Part 1

카테고리

태그

Restarting CPS Season 2

I conducted a Python course called CPS (Crash Python course for SANS family) for about 3 months starting from early 2023. I was busy gaslighting the audience that “Python is easy and fun to learn and you can do so much with it!” when suddenly ChatGPT appeared like a comet, and during the course, even I as the instructor couldn’t tell if this was a Python course or a “How to Use ChatGPT Well” course - it became that scattered as I finished the 3-month course.

Professor Noh Su-rim, who manipulated me into creating the course and enthusiastically attended it, reverse-gaslighted me into starting Season 2 in early 2024. This time, the core of next year’s course will be to actually do something using LLM (Large Language Model) pre-trained models like GPT3.5 and Llama.

And I also decided to start blogging daily, something I was too lazy to do before. Let’s see how this goes…

The first thing to do is to convert the numerous lecture videos, voice recordings, and chat logs of the respected DH Lee through Voice to Text and other frameworks, label and preprocess the data, train it on a pretrained model, and create a DH Lee AI Model API Service.

I want to first use this API as a Chatbot on Telegram, Slack, KakaoTalk, etc., and later collaborate with Midjourney and other commercial visual AI services.

Well, the goal for now is for me as the instructor to quickly work through all the steps, and I plan to document this process on the blog.

And I’m daily discussing and detailing this process through ChatGPT4’s voice chat service, and recently the features have gotten so good that it even creates documents - how can I not post this on the blog?

That concludes my introduction…

Below, I’ll record the content I discussed with ChatGPT4’s AI assistant.

Guide to Transcribing DH Lee’s Voice Data and Training with GPT-3.5

Transcribing Voice to Text Using Microsoft Azure

Setting Up Microsoft Azure Speech Service

Create an Azure Account and enable billing.
Create a Speech Service Resource in Azure Portal and obtain your API key and endpoint URL.
Install Azure SDK for Python using pip install azure-cognitiveservices-speech.

Python Script for Transcription

import azure.cognitiveservices.speech as speechsdk

def transcribe_audio(file_path, service_region, subscription_key):
    speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region=service_region)
    audio_input = speechsdk.AudioConfig(filename=file_path)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_input)

    result = speech_recognizer.recognize_once_async().get()
    return result.text

subscription_key = "YourAzureSubscriptionKey"
service_region = "YourServiceRegion"
audio_file_path = "path/to/your/audiofile.wav"

transcription = transcribe_audio(audio_file_path, service_region, subscription_key)
print(transcription)

Replace "YourAzureSubscriptionKey", "YourServiceRegion", and "path/to/your/audiofile.wav" with your details.

Post-Transcription Steps

Review and correct the transcriptions.
Store the transcriptions in a structured format.

Training Transcribed Data with GPT-3.5

Data Preparation for GPT-3.5

Organize and clean the transcribed data.
Format the data as per OpenAI’s guidelines for fine-tuning.

Fine-Tuning Process

Access the GPT-3.5 API through OpenAI, which may offer fine-tuning capabilities.
Upload your prepared dataset to OpenAI and fine-tune the model on this data.
Define training parameters as needed.

Integration and Testing

Integrate the fine-tuned GPT-3.5 model into your application via API.
Test the model’s performance to ensure it aligns with DH Lee’s style and content.

Considerations

Ensure you have adequate computational resources and expertise.
Obtain DH Lee’s consent and consider ethical aspects of using his data.
Be mindful of the costs associated with API usage and training.

Note: This document is a summarized guide based on the conversation and should be adapted for specific project needs.

Connection Lost

Server Hiccup