Commit Graph

  • 23aadd4734 APSS24 apss24 Morgan 2024-09-03 20:31:05 +0900
  • 0f63360f03 WIP on APSS24 Project: - Add CHECK_CUDA Macro and cudaDeviceSync - Set cudaSetDevice(0) in main.cpp master Jaehwan Lee 2024-08-20 21:04:33 +0900
  • c64f7a5615 Update APSS24 Project: - Rename half in half_cpu to avoid conflict with cuda_fp16 half - Replace Tanh with LeakyReLU for example GPU kernel Jaehwan Lee 2024-08-19 22:07:29 +0900
  • 080ce2ddbe WIP APSS24 Advanced: Cast mean and var to Float in Batchnorm2d Jaehwan Lee 2024-08-18 17:10:23 +0900
  • e5b7ac846a WIP on APSS24 Advanced Project Jaehwan Lee 2024-08-18 16:49:31 +0900
  • db5bf517a5 Initial release of APSS24 Project Jaehwan Lee 2024-08-18 16:45:18 +0900
  • af6671f41d [24.06.10] Updates: - Fix validate function - Increase tolerance ratio to 0.0005 Jaehwan Lee 2024-06-10 20:28:29 +0900
  • 056771a2a5 Updates 2024.05.12 - Update answer.bin - Modify main.cpp, run.sh - Remove README - Migrate tools to final-project-tools/ Jaehwan Lee 2024-05-20 11:06:16 +0900
  • f619c57f1a . sota-junsik 2024-05-05 15:48:01 +0900
  • 70606591da Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta sota-junsik 2024-05-05 15:25:35 +0900
  • a3bec209d7 . sota-junsik 2024-05-05 15:24:56 +0900
  • fad28371ea Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2024-05-05 15:14:18 +0900
  • 30a1f0e903 Clenaup some code Jinpyo Kim 2024-05-05 15:13:23 +0900
  • eb80365789 Modify bin2text (print_output) script Jaehwan Lee 2024-05-05 05:59:58 +0000
  • fe256d1aff Merge pull request #5 from kjp4155/byeonghyun-shpc24-final Jinpyo Kim 2024-05-05 14:58:47 +0900
  • 88913bceab Merge pull request #4 from kjp4155/mod Jinpyo Kim 2024-05-05 14:58:10 +0900
  • 1396426971 modify model param path mod gyeongjecho 2024-05-05 14:46:31 +0900
  • d6600fa13d modify data gyeongjecho 2024-05-05 14:42:06 +0900
  • 238069b494 Add new data (input 16 token, output 8 token) bpe gyeongjecho 2024-05-05 14:18:57 +0900
  • 167e915b58 Modify bpe format and Convert main input & output format gyeongjecho 2024-05-05 05:00:06 +0900
  • e83927ee68 Modify Pretokenizing gyeongjecho 2024-05-04 17:52:06 +0000
  • b5a242be9d Add BPE Tokenizer gyeongjecho 2024-05-04 15:10:06 +0000
  • e86386ac8c add example for CUDA porting (element-wise add) byeonghyun-shpc24-final devko 2024-05-04 16:53:57 +0900
  • 0f69ecb9db count nan value in validate() not just kick out devko 2024-05-04 16:27:52 +0900
  • 4ab473e385 add script for obj directory creation devko 2024-05-04 16:26:30 +0900
  • eeda1c6339 WIP SHPC2024 project Jinpyo Kim 2024-05-04 12:21:11 +0900
  • 138400d102 WIP SHPC2024 project Jinpyo Kim 2024-05-04 12:19:59 +0900
  • 8efdea580d WIP SHPC2024 project Jinpyo Kim 2024-05-04 11:40:24 +0900
  • 584fb68862 WIP SHPC2024 project refactoring Jinpyo Kim 2024-05-02 15:58:31 +0900
  • 7b254040f6 [Refactor] minor fix Jaehwan Lee 2024-04-30 09:47:07 +0000
  • 205831df32 [Refactor] Move tensor codes to model.cu Jaehwan Lee 2024-04-27 16:54:16 +0000
  • 9814f196cb [Refactor] modify generate_tokens() in model.cu Jaehwan Lee 2024-04-27 16:00:10 +0000
  • c20814b81b [Refactor] refactor & modify operations and comments Jaehwan Lee 2024-04-27 15:12:18 +0000
  • 3a11f82ac1 [Refactor] rename activations Jaehwan Lee 2024-04-27 07:47:01 +0000
  • 54768cbe60 Typedef Tensor into Parameter or Activation for readability Jaehwan Lee 2024-04-27 07:27:27 +0000
  • 9c7b7e23f5 Minor fix Jaehwan Lee 2024-04-27 07:14:26 +0000
  • d929051e57 Modify hparam names for better readability 2 Jaehwan Lee 2024-04-27 07:12:57 +0000
  • 298de954b6 Modify hparam names for better readability Jaehwan Lee 2024-04-27 07:03:05 +0000
  • cdf5a70f3e minor fix print_output Jaehwan Lee 2024-04-24 07:45:28 +0000
  • bf16f2f575 update print_output script Jaehwan Lee 2024-04-24 06:24:04 +0000
  • c09d772693 move output array allocation to root process Jaehwan Lee 2024-04-24 05:51:44 +0000
  • 756e22af92 minor fix Jaehwan Lee 2024-04-23 15:09:54 +0000
  • 4beb68a2a8 remove warmup Jaehwan Lee 2024-04-22 10:00:56 +0000
  • 62612cbcfb Only root rank mpi process print logs Jaehwan Lee 2024-04-22 09:57:11 +0000
  • 9094d76a14 Initial release Jaehwan Lee 2024-04-20 07:27:49 +0000
  • 65d1af072a Create 확장형 조교 매뉴얼 (Extended TA Manual) junsik_shin 2024-03-28 13:42:18 +0900
  • f70d6b5fa3 commit shpc24 sota-junsik 2024-03-19 16:19:43 +0900
  • f4563dc9b5 Update pinned_memory.cu junsik_shin 2024-02-22 13:49:14 +0900
  • 452e4bf444 add APWS24 bhko 2024-02-15 14:53:30 +0000
  • 2a75aedec8 change aligned_allc to cudaMallocHost bhko 2024-01-30 02:33:50 +0000
  • a8c8df52f1 Update util.cpp devBHKo 2024-01-29 14:32:20 +0900
  • b73336c43a fix bug in casting fp16 when calling rand() bhko 2024-01-29 02:26:24 +0000
  • e4ca444f08 change conv_im2col_fp16 output datatype which cause overflow at large size of input, fp16 @ fp16 => fp32 not fp16 bhko 2024-01-29 02:12:50 +0000
  • 04a9610a7c add reduction example for GETP24 bhko 2024-01-24 02:33:04 +0000
  • aead5e2a1b Update advanced_9_3_im2col_convolution_with_TC.cu devBHKo 2024-01-24 09:49:46 +0900
  • 4aea2a4ec6 Update advanced_9_2_im2col_convolution_without_TC.cu devBHKo 2024-01-24 09:49:28 +0900
  • 80a79a5f29 Update advanced_9_2_im2col_convolution_with_TC.cu devBHKo 2024-01-24 09:48:41 +0900
  • e27fc72e16 SHPC2023-Fall Upload sota-junsik 2024-01-15 22:37:25 +0900
  • b94d0976ec implement hw junsik-vietnam sota-junsik 2023-11-27 14:43:22 +0000
  • 1f7386f179 jyp checkitout sota-junsik 2023-11-19 23:15:36 +0900
  • b38db61cc2 revised minor stuffs sota-junsik 2023-11-13 14:49:17 +0900
  • 54b98d063f add layernorm sota-junsik 2023-11-07 23:40:46 +0900
  • 68af7fd014 include order sota-junsik 2023-11-06 19:28:44 +0900
  • cfa50a51c6 adjust clang-format sota-junsik 2023-11-06 19:26:42 +0900
  • 45c10f5a21 ... sota-junsik 2023-11-06 19:21:56 +0900
  • 5edfb7581a implement finalproject shpc23-fall sota-junsik 2023-11-06 19:11:41 +0900
  • 0e7596cfdc update the last version of the project file bhko 2023-08-29 10:30:20 +0000
  • d0cc07ff23 add sequential output Chaewon Kim 2023-08-22 01:35:36 +0000
  • 7f14333afb fix conv2d param Chaewon Kim 2023-08-22 01:35:07 +0000
  • 34f1df2a9c revised minor errors sota-junsik 2023-08-21 05:35:26 +0000
  • 17f01f58f3 fix path in help_function; change extern pointer to extern array; change default input batch to 1 bhko 2023-08-19 13:35:05 +0000
  • 4b46c9e305 merge project code from MCRL apss23_project repo bhko 2023-08-18 07:01:05 +0000
  • 33aaa7e59e rename Heehoon Kim 2023-08-18 05:13:18 +0000
  • 6f45fdcd5a advanced 12 1 Heehoon Kim 2023-08-18 04:52:59 +0000
  • 2d59745ce4 advanced 7 1 answer Heehoon Kim 2023-08-17 05:37:54 +0000
  • 9ea8e99f95 WIP APSS23 Jinpyo Kim 2023-08-16 21:11:16 +0000
  • 5369d60f31 WIP APSS23 Jinpyo Kim 2023-08-15 21:19:48 +0000
  • fe46ababde Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2023-08-15 17:23:08 +0000
  • 7147340f9d WIP APSS23 Jinpyo Kim 2023-08-15 17:23:04 +0000
  • 5c7cd68cd3 multi-gpu-comm example Heehoon Kim 2023-08-15 10:23:36 +0000
  • ea9d968314 add TC variants bhko 2023-08-15 09:33:15 +0000
  • d90f62ac21 Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2023-08-13 19:27:39 +0000
  • 43c46af780 WIP APSS23 Jinpyo Kim 2023-08-13 19:27:36 +0000
  • a26327616a repo link fix bhko 2023-08-13 11:16:08 +0000
  • 71c77a785e add project by chaewon bhko 2023-08-13 11:04:11 +0000
  • 346aaa097a Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta bhko 2023-08-13 10:58:35 +0000
  • c0db958843 project bhko 2023-08-13 10:58:25 +0000
  • 25c6f9fa35 Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2023-08-12 18:44:02 +0000
  • 425e0da22a WIP APSS23 Jinpyo Kim 2023-08-12 18:43:58 +0000
  • 0b3d6a75fc add fp16 version convolution for TC bhko 2023-08-12 15:56:11 +0000
  • 8f56efa6fc Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2023-08-12 15:29:34 +0000
  • 06321fa4aa WIP APSS23 Jinpyo Kim 2023-08-12 15:29:31 +0000
  • 02c83b324e add standard(?) answer sota-junsik 2023-08-12 15:21:04 +0000
  • b6825d2a78 Merge branch 'master' of https://github.com/kjp4155/chundoong-lab-ta Jinpyo Kim 2023-08-12 10:01:47 +0000
  • a10f7918bb WIP apss23 Jinpyo Kim 2023-08-12 10:01:43 +0000
  • 6dac637428 clang format sota-junsik 2023-08-12 08:39:23 +0000
  • c6a39bf912 revised image-rotation/ sota-junsik 2023-08-12 08:22:36 +0000
  • cd2da5068d revised integral/ sota-junsik 2023-08-12 07:20:29 +0000
  • 08b2d0b5c7 fix typo bhko 2023-08-10 14:45:46 +0000
  • 849dd1009a apply clang formatter bhko 2023-08-10 14:44:18 +0000