모든것에 On-Premise에서 작동하는 Retrival Argmented Generation의 최소 예제입니다.

README.md

RAG-minimal-example#

모든것이 On-Premise에서 작동하는 Retrival Argmented Generation의 최소 예제입니다.
회사내의 워크스테이션에서 작동하는 vLLM서버를 통하여 언어모델을 작동하고, 개인 컴퓨터에서 임베딩 모델이 작동됩니다. 따라서 서버가 닫혀있으면 작동하지 않습니다.

data 폴더에 넣고 싶은 파일(pdf, word, txt, .md)를 넣고 document_embedding.py를 실행하면 서버위에 data 폴더 안에 있는 파일의 텍스트를 모두 임베딩합니다. 이 리포에는 예제로 텍스트 파일이 들어가 있습니다.

맨 예제 코드 파일 맨 아랫줄에

response = query_engine.query("what did author do?")
print(response)

이것만 수정하면 원하는 질의응답을 할 수 있습니다.

현재 Llama Index의 가장 기본적인 프롬프트 엔지니어링을 사용하고 있습니다. 따라서 대답이 단답형이고 딱딱합니다.

현재 작동중인 모델은 Mixtral 8x7B - instruct v0.1 이며 AWQ 4INT 양자화를 사용하였습니다. 한국어를 지원하지만 때때로 영어로 대답하는 문제가 있으며 이는 LLAMA INDEX상에서 기본적으로 하고 있는 프롬프트 엔지니어링이 모두 영어여서 그럴 수 있습니다. (이 문제는 gpt3.5-turbo도 가지고 있는 문제입니다) 또한 Mixtral의 근본적인 한계로써, 영단어의 한국어 음차를 이상하게 읽는 문제가 있습니다.

document_load_embedding.py 는 새로 임베딩을 실행하지 않고 vector db에서 임베딩을 불러와 검색하는 방식입니다.

 작동 구조 - embedding#

기본 설정된 embedding 파일은 한국어를 지원합니다. 다음의 모델을 사용합니다. sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

 작동 구조 - Vector DB#

vector db 는 postgresql 의 extension인 pgvector를 사용하고 있으며 이에 대한 설치 방법은 해당 프로젝트 github 페이지를 참고하시길 부탁드립니다. GitHub - pgvector/pgvector: Open-source vector similarity search for Postgres
데이터 베이스 접속 설정은 db_config.json을 수정하여 진행합니다.

실행시마다 설정된 table을 리셋하고 새로 임베딩을 빌드하는 방식이기 때문에 파일이 많으면 오래 걸릴 수 있습니다. vector db의 특성상 하나의 새로운 element를 삽입하면 search tree를 다시 지어야 하기 때문에 프로토타입단계에서는 이것으로 마무리하고, 차후 Cassnadra 5.X로 넘어가 고도화 할 계획입니다. (PG vector는 임베딩을 relational 처럼 다룰 수 있게 하지만, 이때문에 상당한 성능의 희생을 감수하였습니다. 그래도 문서 몇천장정도는 이상이 없을것으로 보입니다.)

# 원하는 언어모델 실행

만약 ChatGPT 등 api서비스를 통하여 강력한 언어모델의 사용을 원한다면

  1. 해당되는 서비스의 API키를 받아

  2. LLAMA INDEX 에서 제공하는 바인딩을 이용해 코드 수정

만약 언어 모델을 직접 구동하고 싶다면

  1. 듀얼 부팅등을 통하여 Linux 시스템을 구축하거나 WSL(Windows Subsystem Linux)을 활용하여 최신 리눅스 환경을 구축하고 (윈도우에서 구동하는건 제가 아무것도 못 도와드려요.)

  2. LLAMA CPP를 컴파일하고 (다음의 프로젝트 참조)

    GitHub - ggerganov/llama.cpp: Port of Facebook's LLaMA model in C/C++

    GitHub - abetlen/llama-cpp-python: Python bindings for llama.cpp

  3. Hugging Face에서 호환되는 언어모델 가중치(GUFF)를 받아 LLAMA INDEX에서 제공하는 함수를 통해 코드 수정

실행 예시 답변 (프롬프트 전체 노출)#

형광밑줄 설명
파란색 : 프롬프트 엔지니어링
노란색 : 프롬프트에 삽입된 임베딩 검색 결과
붉은색 : 유저 입력
보라색 : LLM 출력

Context information is below.
---------------------
file_path: data/paul_graham/paul_graham_essay.txt

But Interleaf still had a few years to live yet. [5]

Interleaf had done something pretty bold. Inspired by Emacs, they'd added a scripting language, and even made the scripting language a dialect of Lisp. Now they wanted a Lisp hacker to write things in it. This was the closest thing I've had to a normal job, and I hereby apologize to my boss and coworkers, because I was a bad employee. Their Lisp was the thinnest icing on a giant C cake, and since I didn't know C and didn't want to learn it, I never understood most of the software. Plus I was terribly irresponsible. This was back when a programming job meant showing up every day during certain working hours. That seemed unnatural to me, and on this point the rest of the world is coming around to my way of thinking, but at the time it caused a lot of friction. Toward the end of the year I spent much of my time surreptitiously working on On Lisp, which I had by this time gotten a contract to publish.

The good part was that I got paid huge amounts of money, especially by art student standards. In Florence, after paying my part of the rent, my budget for everything else had been $7 a day. Now I was getting paid more than 4 times that every hour, even when I was just sitting in a meeting. By living cheaply I not only managed to save enough to go back to RISD, but also paid off my college loans.

I learned some useful things at Interleaf, though they were mostly about what not to do. I learned that it's better for technology companies to be run by product people than sales people (though sales is a real skill and people who are good at it are really good at it), that it leads to bugs when code is edited by too many people, that cheap office space is no bargain if it's depressing, that planned meetings are inferior to corridor conversations, that big, bureaucratic customers are a dangerous source of money, and that there's not much overlap between conventional office hours and the optimal time for hacking, or conventional offices and the optimal place for it.

But the most important thing I learned, and which I used in both Viaweb and Y Combinator, is that the low end eats the high end: that it's good to be the "entry level" option, even though that will be less prestigious, because if you're not, someone else will be, and will squash you against the ceiling. Which in turn means that prestige is a danger sign.

When I left to go back to RISD the next fall, I arranged to do freelance work for the group that did projects for customers, and this was how I survived for the next several years. When I came back to visit for a project later on, someone told me about a new thing called HTML, which was, as he described it, a derivative of SGML. Markup language enthusiasts were an occupational hazard at Interleaf and I ignored him, but this HTML thing later became a big part of my life.

In the fall of 1992 I moved back to Providence to continue at RISD. The foundation had merely been intro stuff, and the Accademia had been a (very civilized) joke. Now I was going to see what real art school was like. But alas it was more like the Accademia than not. Better organized, certainly, and a lot more expensive, but it was now becoming clear that art school did not bear the same relationship to art that medical school bore to medicine. At least not the painting department. The textile department, which my next door neighbor belonged to, seemed to be pretty rigorous. No doubt illustration and architecture were too. But painting was post-rigorous. Painting students were supposed to express themselves, which to the more worldly ones meant to try to cook up some sort of distinctive signature style.

A signature style is the visual equivalent of what in show business is known as a "schtick": something that immediately identifies the work as yours and no one else's. For example, when you see a painting that looks like a certain kind of cartoon, you know it's by Roy Lichtenstein. So if you see a big painting of this type hanging in the apartment of a hedge fund manager, you know he paid millions of dollars for it. That's not always why artists have a signature style, but it's usually why buyers pay a lot for such work. [6]

There were plenty of earnest students too: kids who "could draw" in high school, and now had come to what was supposed to be the best art school in the country, to learn to draw even better.

file_path: data/paul_graham/paul_graham_essay.txt

Meanwhile I'd been hearing more and more about this new thing called the World Wide Web. Robert Morris showed it to me when I visited him in Cambridge, where he was now in grad school at Harvard. It seemed to me that the web would be a big deal. I'd seen what graphical user interfaces had done for the popularity of microcomputers. It seemed like the web would do the same for the internet.

If I wanted to get rich, here was the next train leaving the station. I was right about that part. What I got wrong was the idea. I decided we should start a company to put art galleries online. I can't honestly say, after reading so many Y Combinator applications, that this was the worst startup idea ever, but it was up there. Art galleries didn't want to be online, and still don't, not the fancy ones. That's not how they sell. I wrote some software to generate web sites for galleries, and Robert wrote some to resize images and set up an http server to serve the pages. Then we tried to sign up galleries. To call this a difficult sale would be an understatement. It was difficult to give away. A few galleries let us make sites for them for free, but none paid us.

Then some online stores started to appear, and I realized that except for the order buttons they were identical to the sites we'd been generating for galleries. This impressive-sounding thing called an "internet storefront" was something we already knew how to build.

So in the summer of 1995, after I submitted the camera-ready copy of ANSI Common Lisp to the publishers, we started trying to write software to build online stores. At first this was going to be normal desktop software, which in those days meant Windows software. That was an alarming prospect, because neither of us knew how to write Windows software or wanted to learn. We lived in the Unix world. But we decided we'd at least try writing a prototype store builder on Unix. Robert wrote a shopping cart, and I wrote a new site generator for stores — in Lisp, of course.

We were working out of Robert's apartment in Cambridge. His roommate was away for big chunks of time, during which I got to sleep in his room. For some reason there was no bed frame or sheets, just a mattress on the floor. One morning as I was lying on this mattress I had an idea that made me sit up like a capital L. What if we ran the software on the server, and let users control it by clicking on links? Then we'd never have to write anything to run on users' computers. We could generate the sites on the same server we'd serve them from. Users wouldn't need anything more than a browser.

This kind of software, known as a web app, is common now, but at the time it wasn't clear that it was even possible. To find out, we decided to try making a version of our store builder that you could control through the browser. A couple days later, on August 12, we had one that worked. The UI was horrible, but it proved you could build a whole store through the browser, without any client software or typing anything into the command line on the server.

Now we felt like we were really onto something. I had visions of a whole new generation of software working this way. You wouldn't need versions, or ports, or any of that crap. At Interleaf there had been a whole group called Release Engineering that seemed to be at least as big as the group that actually wrote the software. Now you could just update the software right on the server.

We started a new company we called Viaweb, after the fact that our software worked via the web, and we got $10,000 in seed funding from Idelle's husband Julian. In return for that and doing the initial legal work and giving us business advice, we gave him 10% of the company. Ten years later this deal became the model for Y Combinator's. We knew founders needed something like this, because we'd needed it ourselves.

At this stage I had a negative net worth, because the thousand dollars or so I had in the bank was more than counterbalanced by what I owed the government in taxes. (Had I diligently set aside the proper proportion of the money I'd made consulting for Interleaf? No, I had not.) So although Robert had his graduate student stipend, I needed that seed funding to live on.

We originally hoped to launch in September, but we got more ambitious about the software as we worked on it.
---------------------
Given the context information and not prior knowledge, answer the query.
Query:
저자 그래이엄은 어느 대학교를 갔어? 고유명사는 그대로 적어주었으면 좋겠고 약어는 풀어서 한 번 설명했으면 해.
Answer: 저자 그래이엄은 로드 아일랜드 디자인 사무학교(Rhode Island School of Design, RISD)에서 그림 과정을 이수했습니다. 이 대학교는 미국에서 그림 교육이 유명한 대학교 중 하나입니다. 그래이엄은 여기서 그림 기술을 배우기 위해 다녔으며, 이후 컴퓨터 그래픽스 분야에서 활동했습니다.