This thesis presents and evaluates a directory-enhanced network-on-chip (NoC) for FPGAs, with the goal of improving the performance of cores generated by FCUDA, a translation tool that enables CUDA code to run on FPGAs. NoCs are an inherently scalable platform, as aggregate system bandwidth increases with the number of nodes in the system. This work enhances an existing NoC to include a directory protocol capable of tracking the location of on-chip data stored in core-local BRAMs. With this tracking, requests that would normally be satisfied by off-chip memory can be fulfilled by on-chip sources, allowing performance gains. Simulation results show that a directory-enhanced NoC achieves speed gains of up to 40% over an ordinary NoC for some applications. In addition, simulation and synthesis results show potential for increased overall application performance over a bus-based system, despite the significant area overhead of NoC routers.