바쁨 대기, 배율, 신호량, 조건 변수 소개: 추정π의 예(병렬 계산, Pthread 라이브러리로)

먼저 hellowerld 프로그램으로 여러 개의 라인을 만들고 문장을 인쇄합니다. 주로 이 세 가지 함수를 설명하고자 합니다.
int pthread_create(pthread_t* thread,const pthread_attr_t* attr,void* fuc,void* arg);//루틴 생성 및 실행 함수 연결
int pthread_join(pthread_t* thread, void **retval);//다른 라인이 끝날 때까지 기다립니다. 이곳의 끝은 메모리 공간의 방출을 의미합니다
free(pthread t*)//프로그램이 할당한 공간을 수동으로 방출합니다. 예를 들어malloc ()
다음은 코드(gcc -g - Wall -o hello world heloworld.c -lpthread;./helloworld)입니다.

#include
#include
#include
/*

   int pthread_create(pthread_t* thread,const pthread_attr_t* attr,void* fuc,void* arg);

   int pthread_join(pthread_t* thread, void **retval);

*/

//thread's num
int thread_count;

void *Hello(void* rank);

int main(int argc,char* argv[]){
	//Use long in case of 64-bit system
	long thread;
	pthread_t* thread_handles;

	//Get number of threads from command line
	thread_count = strtol(argv[1],NULL,10);
	thread_handles = malloc(thread_count*sizeof(pthread_t));

	for(thread = 0;thread < thread_count;thread++){
        //Create threads
		pthread_create(&thread_handles[thread],NULL,Hello,(void*)thread);
	}
	printf("Hello from the main thread
");

	for(thread = 0;thread < thread_count;thread++){
        //Wait util thread_handles[thread] complete
		pthread_join(thread_handles[thread],NULL);
	}
	free(thread_handles);
	return 0;
}//main

void *Hello(void *rank){
	long my_rank=(long)rank;
	printf("Hello from thread %ld of %d
",my_rank,thread_count);
	return NULL;
}//Hello

기본 코드(Hello 함수는 당신이 실행하고 싶은 함수로 대체할 수 있음)에 익숙해진 후에 병렬 방법으로π의 크기를 추정해 보세요.
공식은 다음과 같다. pi=4(1-1/3+1/5-1/7+...+((-1)^n)*(1/(2n+1))+...)
우리는 여러 개의 라인으로 이 계산들을 처리하고, 각 라인은 일부분을 처리한다.모든 계산을 더하면π의 값이다.
이것은 하나의 문제와 관련이 있다. 변수sum를 전역 변수로 가정하면 여러 개의 라인이 그것(임계구)을 수정할 때 덮어쓰는 상황이 발생할 수 있다.
다음은 내가 각각 바쁘게 기다리고, 서로 배척량, 신호량으로 임계구 문제를 해결한다.
/
대기 중:

#include
#include
#include
/*
   pi=4(1-1/3+1/5-1/7+...+((-1)^n)*(1/(2n+1))+...)

   critical_sections problem
   method:Busy-Waiting
       but it will continually use the CPU accomplishing nothing
*/

int thread_count;            //thread's num
int n = 1000000;             //10^6
double sum = 0.0;
int flag = 0;                

void *Thread_sum(void* rank);

int main(int argc,char* argv[]){
	//Use long in case of 64-bit system
	long thread;
	pthread_t* thread_handles;	
	//Get number of threads from command line
	thread_count = strtol(argv[1],NULL,10);
	thread_handles = malloc(thread_count*sizeof(pthread_t));

	for(thread = 0;thread < thread_count;thread++){
        //Create threads
		pthread_create(&thread_handles[thread],NULL,Thread_sum,(void*)thread);
	}

	printf("Hello from the main thread
");

	for(thread = 0;thread < thread_count;thread++){
        //Wait util thread_handles[thread] complete
		pthread_join(thread_handles[thread],NULL);

	}

	free(thread_handles);
	printf("%f",4*sum);
	return 0;
}//main

void *Thread_sum(void *rank){
	long my_rank=(long)rank;
	double factor,my_sum = 0.0;
	long long i;
	long long my_n = n/thread_count;
	long long my_first_i = my_n*my_rank;
	long long my_last_i = my_first_i + my_n;

	if(my_first_i % 2 == 0)
		factor = 1.0;
	else
        factor = -1.0;

	for(i = my_first_i;i < my_last_i;i++,factor = -factor){
		my_sum += factor/(2*i+1);
	}

	//Use Busy-Waiting to solve critical sections after loop
	while(flag != my_rank);
	sum += my_sum;
	flag = (flag+1) % thread_count;
	return NULL;
}//Thread_sum

단점:
1. 그러나 이런 방법은 지속적으로 CPU 자원을 차지하고(while을 사용했기 때문에) 덧붙이는 순서는 번호에 따른다.
2. 컴파일러 최적화를 켜면 신뢰할 수 없을 수도 있습니다
/
상호 배율:

#include

#include

#include

/*

   pi=4(1-1/3+1/5-1/7+...+((-1)^n)*(1/(2n+1))+...)



   critical_sections problem

   method:Mutexes

*/



int thread_count;            //thread's num

int n = 1000000;             //10^6

double sum = 0.0;

int flag = 0;                

pthread_mutex_t mutex;



void *Thread_sum(void* rank);



int main(int argc,char* argv[]){

	//Use long in case of 64-bit system

	long thread;

	pthread_t* thread_handles;

	

	//Get number of threads from command line

	thread_count = strtol(argv[1],NULL,10);

	thread_handles = malloc(thread_count*sizeof(pthread_t));

	//initialize Mutex

	pthread_mutex_init(&mutex,NULL);



	for(thread = 0;thread < thread_count;thread++){

        	//Create threads

		pthread_create(&thread_handles[thread],NULL,Thread_sum,(void*)thread);

	}

	printf("Hello from the main thread
");

	for(thread = 0;thread < thread_count;thread++){

                //Wait util thread_handles[thread] complete

		pthread_join(thread_handles[thread],NULL);

	}



	free(thread_handles);

	pthread_mutex_destroy(&mutex);	

	printf("%f",4*sum);

	return 0;

}//main



void *Thread_sum(void *rank){

	long my_rank=(long)rank;

	double factor,my_sum = 0.0;

	long long i;

	long long my_n = n/thread_count;

	long long my_first_i = my_n*my_rank;

	long long my_last_i = my_first_i + my_n;



	if(my_first_i % 2 == 0)

		factor = 1.0;

	else

		factor = -1.0;

	for(i = my_first_i;i < my_last_i;i++,factor = -factor){

		my_sum += factor/(2*i+1);

	}



	//Use Mutexes to solve critical sections after loop

	pthread_mutex_lock(&mutex);

	sum += my_sum;

	pthread_mutex_unlock(&mutex);



	return NULL;

}//Thread_sum

이 예에서는 임계 구역에 잠금 (pthread mutex lock () 과 잠금 해제 (pthread mutex unlock () 를 통해 매번 한 라인만 임계 구역에 접근할 수 있도록 제한합니다.그래서 상호 배척량은 관건적인 부분에 대한 충돌 방문을 피할 수 있다
///
신호량:

#include

#include

#include

#include

/*

   pi=4(1-1/3+1/5-1/7+...+((-1)^n)*(1/(2n+1))+...)



   critical_sections problem

   method:semaphore

*/



int thread_count;            //thread's num

int n = 1000000;             //10^6

double sum = 0.0;

int flag = 0;                

sem_t sem;



void *Thread_sum(void* rank);



int main(int argc,char* argv[]){

	//Use long in case of 64-bit system

	long thread;

	pthread_t* thread_handles;

	

	//Get number of threads from command line

	thread_count = strtol(argv[1],NULL,10);

	thread_handles = malloc(thread_count*sizeof(pthread_t));

	//initialize semaphore

	sem_init(&sem,0,1);



	for(thread = 0;thread < thread_count;thread++){

        	//Create threads

		pthread_create(&thread_handles[thread],NULL,Thread_sum,(void*)thread);

	}

	printf("Hello from the main thread
");

	for(thread = 0;thread < thread_count;thread++){

                //Wait util thread_handles[thread] complete

		pthread_join(thread_handles[thread],NULL);

	}



	free(thread_handles);

	sem_destroy(&sem);	

	printf("%f",4*sum);

	return 0;

}//main



void *Thread_sum(void *rank){

	long my_rank=(long)rank;

	double factor,my_sum = 0.0;

	long long i;

	long long my_n = n/thread_count;

	long long my_first_i = my_n*my_rank;

	long long my_last_i = my_first_i + my_n;



	if(my_first_i % 2 == 0)

		factor = 1.0;

	else

		factor = -1.0;

	for(i = my_first_i;i < my_last_i;i++,factor = -factor){

		my_sum += factor/(2*i+1);

	}



	//Use semaphore to solve critical sections after loop

	sem_wait(&sem);

	sum += my_sum;

	sem_post(&sem);



	return NULL;

}//Thread_sum

여기서sem=1을 초기화하여 첫 번째 임계구역에 접근할 라인에서sem 를 실행합니다wait(sem),sem를 1로 줄이고 임계 구역에 접근할 수 있습니다.다른 스레드는semwait(sem)시sem=0으로 인해 기다리는 상태입니다.
장점: 신호량은 상호 배척량보다 강하다. 왜냐하면 신호량은 어떠한 비음의 값으로도 초기화할 수 있기 때문이다.
/
barrier는 프로그램의 한 점입니다. 이 점에서 모든 라인이 도착할 때까지 라인이 막힙니다.다중 스레드 동기화 문제로도 이해할 수 있다.
다음은 상호 배율과 조건 변수를 결합하여 이 과정을 시뮬레이션합니다.

#include

#include

#include

/*

   pi=4(1-1/3+1/5-1/7+...+((-1)^n)*(1/(2n+1))+...)



   critical_sections problem

   method:Mutexes

   barrier problem

   method:condition variables

*/



int thread_count;            //thread's num

int n = 1000000;             //10^6

double sum = 0.0;

int flag = 0;

int count = 0;         //Use it to judge whether all of threads arrive barrier                

pthread_mutex_t mutex;

pthread_cond_t cond_var;



void *Thread_sum(void* rank);



int main(int argc,char* argv[]){

	//Use long in case of 64-bit system

	long thread;

	pthread_t* thread_handles;

	

	//Get number of threads from command line

	thread_count = strtol(argv[1],NULL,10);

	thread_handles = malloc(thread_count*sizeof(pthread_t));

	//initialize Mutex

	pthread_mutex_init(&mutex,NULL);



	for(thread = 0;thread < thread_count;thread++){

        	//Create threads

		pthread_create(&thread_handles[thread],NULL,Thread_sum,(void*)thread);

	}

	printf("Hello from the main thread
");

	for(thread = 0;thread < thread_count;thread++){

                //Wait util thread_handles[thread] complete

		pthread_join(thread_handles[thread],NULL);

	}



	free(thread_handles);

	pthread_mutex_destroy(&mutex);	

	printf("%f",4*sum);

	return 0;

}//main



void *Thread_sum(void *rank){

	long my_rank=(long)rank;

	double factor,my_sum = 0.0;

	long long i;

	long long my_n = n/thread_count;

	long long my_first_i = my_n*my_rank;

	long long my_last_i = my_first_i + my_n;



	if(my_first_i % 2 == 0)

		factor = 1.0;

	else

		factor = -1.0;

	for(i = my_first_i;i < my_last_i;i++,factor = -factor){

		my_sum += factor/(2*i+1);

	}



	//Use Mutexes to solve critical sections after loop

	//Use condition variables to solve barrier problem

	pthread_mutex_lock(&mutex);

	sum += my_sum;

	count++;

	if(count == thread_count){

		count = 0;

		pthread_cond_broadcast(&cond_var);

		printf("%ld(the last thread) has arrive at barrier
",my_rank);

	}else{

		while(pthread_cond_wait(&cond_var,&mutex) != 0);

		printf("%ld wake up
",my_rank);

	}

	pthread_mutex_unlock(&mutex);



	return NULL;

}//Thread_sum

여기 pthreadcond_wait()가 실행하는 메커니즘은 다음과 같습니다.
1, 대기 대기열에 라인을 놓고 잠금 해제
2, 대기 pthreadcond_signal 또는 pthreadcond_broadcast 신호를 보내고 경쟁 자물쇠를 채워요.
3. 서로 반발할 정도로 경쟁하면 자물쇠를 채운다.
이렇게 하면 임계구 변수count(계수기)를 수정하여barrier 문제를 해결할 수 있다.

이 내용에 흥미가 있습니까?

현재 기사가 여러분의 문제를 해결하지 못하는 경우 AI 엔진은 머신러닝 분석(스마트 모델이 방금 만들어져 부정확한 경우가 있을 수 있음)을 통해 가장 유사한 기사를 추천합니다:

OpenCL 병렬 계산 3 - 호스트 프로그래밍 데이터 전송 및 구분

캐시 대상과 이미지 대상의 데이터 구조는cl 입니다.mem.캐시 개체를 생성할 때 매개변수 설명: clCreateBuffer(cl context context, cl mem flags options, size t s...

텍스트를 자유롭게 공유하거나 복사할 수 있습니다.하지만 이 문서의 URL은 참조 URL로 남겨 두십시오.

CC BY-SA 2.5, CC BY-SA 3.0 및 CC BY-SA 4.0에 따라 라이센스가 부여됩니다.

바쁨 대기, 배율, 신호량, 조건 변수 소개: 추정π의 예(병렬 계산, Pthread 라이브러리로)

좋은 웹페이지 즐겨찾기