How to declare a global dynamic array with C/OpenACC with PGI compiler


Tag : c , By : Harvey
Date : January 12 2021, 08:33 AM

When you use "declare" with a pointer, you create a global device pointer, but not the array that the pointer points to. When you then try to update the array, it does not exist on the device, which is why you get the runtime errors.
To fix this, you also need to add the array to a data region, for example with the "enter data" directive as I show below. When you put the array in a data region, the runtime creates device space for the array and then goes back and "attaches" it to "A", i.e. it fills in the device copy of "A" with the correct device pointer value.
% cat header.h

#include <stdio.h>
#include <stdlib.h>

extern int *A;
#pragma acc declare create(A)

% cat header.c
#include "header.h"
int *A;

% cat test.c
#include "header.h"
int main(int argc, char* argv[]){
        printf("main() start\n");
        int sum=0;
        int N=0;
        if(argc==1){
          printf("usage: ./main.exe N\n");
        }else{
          N=atoi(argv[1]);
        }
        printf("N =%d\n", N);
        A=(int*)malloc(N*sizeof(int));
        #pragma acc enter data create(A[0:N])

        for(int i=0;i<N;i++){A[i]=i;}
        printf("almost data region\n");
        #pragma acc data copy(sum)
        {
             printf("inside data region\n");
             #pragma acc update device(A[0:N])
             #pragma acc parallel loop present(A) reduction(+:sum)
             for(int i=0;i<N;i++){
                sum+=A[i];
             }
        }
        printf("sum = %d\n",sum);
        #pragma acc exit data delete(A)
        free(A);
        exit(0);
    }
% pgcc -I./ test.c header.c -ta=tesla:cc60 -Minfo=accel
test.c:
main:
     13, Generating enter data create(A[:N])
     17, Generating copy(sum)
     21, Generating update device(A[:N])
         Accelerator kernel generated
         Generating Tesla code
         21, Generating reduction(+:sum)
         22, #pragma acc loop gang, vector(128) /* blockIdx.x threadIdx.x */
     27, Generating exit data delete(A[:1])
header.c:
% setenv PGI_ACC_TIME 1
% a.out 1024
main() start
N =1024
almost data region
inside data region
sum = 523776

Accelerator Kernel Timing data
test.c
  main  NVIDIA  devicenum=0
    time(us): 124
    13: upload reached 1 time
        13: data copyin transfers: 1
             device time(us): total=33 max=33 min=33 avg=33
    13: data region reached 1 time
        13: data copyin transfers: 1
             device time(us): total=9 max=9 min=9 avg=9
    17: data region reached 2 times
        17: data copyin transfers: 1
             device time(us): total=33 max=33 min=33 avg=33
        26: data copyout transfers: 1
             device time(us): total=22 max=22 min=22 avg=22
    21: update directive reached 1 time
        21: data copyin transfers: 1
             device time(us): total=10 max=10 min=10 avg=10
    21: compute region reached 1 time
        21: kernel launched 1 time
            grid: [8]  block: [128]
             device time(us): total=4 max=4 min=4 avg=4
            elapsed time(us): total=589 max=589 min=589 avg=589
        21: reduction kernel launched 1 time
            grid: [1]  block: [256]
             device time(us): total=4 max=4 min=4 avg=4
            elapsed time(us): total=27 max=27 min=27 avg=27
    27: data region reached 1 time
        27: data copyin transfers: 1
             device time(us): total=9 max=9 min=9 avg=9
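
The same pattern can be packaged into small helpers so that the host allocation and the device "enter data" always happen together. This is only a minimal sketch and not part of the original answer: the function names alloc_A, sum_A and free_A are hypothetical, and it assumes a compiler that honors the OpenACC directives (without OpenACC support the pragmas are ignored and the code simply runs on the host).

```c
#include <assert.h>
#include <stdlib.h>

int *A;                         /* global pointer, as in header.c */
#pragma acc declare create(A)   /* creates only the device pointer */

/* Hypothetical helper: allocate n ints on the host, then create the
   matching device array so the runtime can attach it to the device A. */
int alloc_A(int n)
{
    assert(n > 0);
    A = (int *)malloc((size_t)n * sizeof(int));
    if (A == NULL) return -1;
    #pragma acc enter data create(A[0:n])
    return 0;
}

/* Hypothetical helper: fill A on the host, push it to the device,
   and sum it with a parallel reduction. */
long sum_A(int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++) A[i] = i;
    #pragma acc update device(A[0:n])
    #pragma acc parallel loop present(A[0:n]) reduction(+:sum)
    for (int i = 0; i < n; i++) sum += A[i];
    return sum;
}

/* Hypothetical helper: release the device array first, then the host one. */
void free_A(int n)
{
    #pragma acc exit data delete(A[0:n])
    free(A);
    A = NULL;
}
```

Calling alloc_A(1024), then sum_A(1024), then free_A(1024) should reproduce the sum of 523776 from the run above.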


How to declare dynamic properties with Google Closure Compiler?


Tag : javascript , By : semicolonth
Date : March 29 2020, 07:55 AM
This falls under the "Restrictions for ADVANCED_OPTIMIZATIONS" section of the documentation. You must refer to a property consistently, using either dotted notation or quoted syntax. When you mix the two, the compiler may rename the dotted accesses but will not touch the quoted ones, and so it generates incorrect code.

How to declare global variable for google closure compiler?


Tag : javascript , By : Manu
Date : March 29 2020, 07:55 AM
As Rohan points out in the comments, the Closure Compiler treats window.myGlobal and myGlobal as different things, even though you and I know they are actually the same. If you need to define it inside a function, pick one form and use it consistently, for example:
// Option 1: declare it with var at the top level
var myGlobal;
(function() { myGlobal = 42; })();
console.log(myGlobal);
// Option 2: always refer to it as a property of window
window.myGlobal = null;
(function() { window.myGlobal = 42; })();
console.log(window.myGlobal);

Can I use printf(or something) in OpenACC with PGI compiler?


Tag : debugging , By : Sinisa Ruzin
Date : March 29 2020, 07:55 AM
Posting the answer from the comments for completeness.
Building with -g and setting LD_LIBRARY_PATH to point to the toolkit directory /linux86-64/lib resolved this issue.

How to perform manual deep copy of 2D dynamic array of struct in C using OpenACC


Tag : c , By : nabbed
Date : March 29 2020, 07:55 AM
Close. The only thing I'd do differently is to not create "domain.bucket" and to update the bucket's count so the device has this information. Also, since updates and copies are shallow, be sure to update only the list array or the scalars in the structs. Otherwise you may overwrite device or host pointers. Here's an example. While I'm using Linux, other than the executable name, the code should be the same.
% cat test.c

#include <stdio.h>
#include <stdlib.h>

typedef struct{
  int *list;  // it is list of particles in a given bucket
  int  count; // it is the total number of particles in the bucket
} structBucket;


typedef struct{
structBucket  **bucket;
int    numberOfBuckets[2]; // number of buckets in x- and y- dimensions
} structDomain;

#define XDIM 0  // index of the x-dimension in numberOfBuckets[2]
#define YDIM 1  // index of the y-dimension in numberOfBuckets[2]

int main() {

  structDomain domain;
  int iX,iY, capacity;

// Allocate memory for **bucket
  domain.numberOfBuckets[XDIM] = 10; domain.numberOfBuckets[YDIM] = 5;

  domain.bucket = (structBucket**)malloc( sizeof(structBucket*) * domain.numberOfBuckets[XDIM] );

   for (iX=0 ; iX < domain.numberOfBuckets[XDIM] ; iX++)
      domain.bucket[iX] = (structBucket*)malloc( sizeof(structBucket) * domain.numberOfBuckets[YDIM]);


// Calculate domain.bucket[iX][iY].count here using some logic
  for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
  {
    for (iY = 0; iY < domain.numberOfBuckets[YDIM]; iY++)
    {
       domain.bucket[iX][iY].count = iX*domain.numberOfBuckets[YDIM]+iY;
  }}
#pragma acc enter data copyin(domain)
#pragma acc enter data create(domain.bucket[:domain.numberOfBuckets[XDIM]][:domain.numberOfBuckets[YDIM]])
// Allocate memory for *list
  for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
  {
    for (iY = 0; iY < domain.numberOfBuckets[YDIM]; iY++)
    {
        capacity = domain.bucket[iX][iY].count;
#pragma acc update device(domain.bucket[iX][iY].count)
        if (capacity > 0)
        {
          domain.bucket[iX][iY].list = (int *)malloc(sizeof(int) * capacity);
#pragma acc enter data create(domain.bucket[iX][iY].list[:capacity])
        }
    }
  }

#pragma acc parallel loop gang collapse(2) present(domain)
  for (iX = 0; iX < domain.numberOfBuckets[XDIM]; iX++)
  {
    for (iY = 0; iY < domain.numberOfBuckets[YDIM]; iY++)
    {
        capacity = domain.bucket[iX][iY].count;
        if (capacity > 0) {
#pragma acc loop vector
           for (int i = 0; i < capacity; ++i) {
                domain.bucket[iX][iY].list[i] = i;
           }
        }
   }}

  for (iX = 0; iX < 5; iX++)
  {
    for (iY = 0; iY < 5; iY++)
    {
        capacity = domain.bucket[iX][iY].count;
        if (capacity > 0) {
#pragma acc update host(domain.bucket[iX][iY].list[:capacity])
           printf("iX=%d iY=%d Cnt=%d\n\t",iX,iY,capacity);
           for (int i = 0; i < capacity; ++i) {
                printf("%d ",domain.bucket[iX][iY].list[i]);
           }
           printf("\n");
        }
   }}

  exit(0);
}
% pgcc test.c -ta=tesla -Minfo=accel -V19.4
main:
     40, Generating enter data copyin(domain)
     41, Generating enter data create(domain.bucket[:domain.numberOfBuckets][:domain.numberOfBuckets])
     49, Generating update device(domain.bucket->->count)
     52, Generating enter data create(domain.bucket->->list[:capacity])
     57, Generating present(domain)
         Generating Tesla code
         58, #pragma acc loop gang collapse(2) /* blockIdx.x */
         60,   /* blockIdx.x collapsed */
         65, #pragma acc loop vector(128) /* threadIdx.x */
     65, Accelerator restriction: size of the GPU copy of domain.bucket is unknown
         Loop is parallelizable
     78, Generating update self(domain.bucket->->list[:capacity])
% a.out
iX=0 iY=1 Cnt=1
        0
iX=0 iY=2 Cnt=2
        0 1
iX=0 iY=3 Cnt=3
        0 1 2
iX=0 iY=4 Cnt=4
        0 1 2 3
iX=1 iY=0 Cnt=5
        0 1 2 3 4
iX=1 iY=1 Cnt=6
        0 1 2 3 4 5
iX=1 iY=2 Cnt=7
        0 1 2 3 4 5 6
iX=1 iY=3 Cnt=8
        0 1 2 3 4 5 6 7
iX=1 iY=4 Cnt=9
        0 1 2 3 4 5 6 7 8
iX=2 iY=0 Cnt=10
        0 1 2 3 4 5 6 7 8 9
iX=2 iY=1 Cnt=11
        0 1 2 3 4 5 6 7 8 9 10
iX=2 iY=2 Cnt=12
        0 1 2 3 4 5 6 7 8 9 10 11
iX=2 iY=3 Cnt=13
        0 1 2 3 4 5 6 7 8 9 10 11 12
iX=2 iY=4 Cnt=14
        0 1 2 3 4 5 6 7 8 9 10 11 12 13
iX=3 iY=0 Cnt=15
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
iX=3 iY=1 Cnt=16
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
iX=3 iY=2 Cnt=17
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
iX=3 iY=3 Cnt=18
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
iX=3 iY=4 Cnt=19
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
iX=4 iY=0 Cnt=20
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
iX=4 iY=1 Cnt=21
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
iX=4 iY=2 Cnt=22
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
iX=4 iY=3 Cnt=23
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
iX=4 iY=4 Cnt=24
        0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
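
The example above exits without releasing anything. If the structure has to be torn down while the program keeps running, one option is to delete the device copies in the reverse order of their creation, and to free each host block only after its device copy is gone. The following is a sketch under my own assumptions, not part of the original answer: build_domain and release_domain are hypothetical helpers, and "domain" is moved to file scope so that both helpers refer to the same host address.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int *list; int count; } structBucket;
typedef struct { structBucket **bucket; int numberOfBuckets[2]; } structDomain;

static structDomain domain;  /* file scope: assumption made for this sketch */

/* Hypothetical helper: host allocation plus top-down device creation,
   condensed from the example above. */
void build_domain(int nx, int ny)
{
  assert(nx > 0 && ny > 0);
  domain.numberOfBuckets[0] = nx;
  domain.numberOfBuckets[1] = ny;
  domain.bucket = (structBucket **)malloc(sizeof(structBucket *) * nx);
  for (int iX = 0; iX < nx; iX++) {
    domain.bucket[iX] = (structBucket *)malloc(sizeof(structBucket) * ny);
    for (int iY = 0; iY < ny; iY++) {
      domain.bucket[iX][iY].count = iX * ny + iY;
      domain.bucket[iX][iY].list = NULL;
    }
  }
#pragma acc enter data copyin(domain)
#pragma acc enter data create(domain.bucket[:nx][:ny])
  for (int iX = 0; iX < nx; iX++) {
    for (int iY = 0; iY < ny; iY++) {
      int capacity = domain.bucket[iX][iY].count;
#pragma acc update device(domain.bucket[iX][iY].count)
      if (capacity > 0) {
        domain.bucket[iX][iY].list = (int *)malloc(sizeof(int) * capacity);
#pragma acc enter data create(domain.bucket[iX][iY].list[:capacity])
      }
    }
  }
}

/* Hypothetical helper: tear down bottom-up (lists, bucket table, struct).
   Returns the number of lists that were released. */
int release_domain(void)
{
  int nx = domain.numberOfBuckets[0], ny = domain.numberOfBuckets[1];
  int released = 0;
  for (int iX = 0; iX < nx; iX++) {
    for (int iY = 0; iY < ny; iY++) {
      int capacity = domain.bucket[iX][iY].count;
      if (capacity > 0) {
#pragma acc exit data delete(domain.bucket[iX][iY].list[:capacity])
        free(domain.bucket[iX][iY].list);
        released++;
      }
    }
  }
  /* Delete the device table before freeing the host rows it refers to. */
#pragma acc exit data delete(domain.bucket[:nx][:ny])
#pragma acc exit data delete(domain)
  for (int iX = 0; iX < nx; iX++) free(domain.bucket[iX]);
  free(domain.bucket);
  domain.bucket = NULL;
  return released;
}
```

With the 10 x 5 layout used above, only bucket (0,0) has a count of zero, so release_domain() frees 49 lists.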
